Project E335 was an experimental trial done at Iluma’s broiler experimental farm located in the town of Fredonia, Antioquia (Colombia) at 1800m above sea level and an average annual temperature of 20º C. It aimed to evaluate the broilers’ microbiome, gene expression and gut histology response to different Calcium and VitD levels in the broiler’s diet.
The experiment consisted of 8 treatments differing in Ca concentration across the growth stages and also including presence or absence of VitD in the diet.
Microbiome, gut histology and gene expression data were collected from animals at 33 days of age, following treatment from day of hatch. For microbiome samples, gut content was collected from three different sampling locations: Cecum, Ileum and Feces. For gene expression and histology samples, tissue samples were collected from two sampling locations: Cecum and Ileum (refer to diagram).

As seen in table 1, six microbiome samples per treatment were collected in each gut location, except for four samples that are excluded or missing.
| Ileum | Cecum | Feces | |
|---|---|---|---|
| HighCa | 6 | 6 | 6 |
| HighCa+MediumAlphaD3 | 6 | 6 | 5 |
| LowCa | 5 | 6 | 6 |
| LowCa+MediumAlphaD3 | 6 | 6 | 6 |
| MediumHighCa | 6 | 5 | 6 |
| MediumHighCa+MediumAlphaD3 | 6 | 6 | 5 |
| MediumLowCa | 6 | 6 | 5 |
| MediumLowCa+MediumAlphaD3 | 6 | 6 | 6 |
Sequences identified during the sequencing process are grouped in batches of identical sequences so that the number of times that sequences was found can be recorded. This is the abundance of the sequence. The groups of identical sequences are called Operational Taxonomic Units (OTUs) or Amplicon Sequence Variants (ASVs), and they often (but not always) correspond to species or genera of bacteria. In the plot below we can observe the prevalence of OTUs across all samples (the fixed threshold for an OTU to be considered present in a sample is 0.1% relative abundance).
The sequencing process identified more than 1300 unique sequences, and as expected, only a few OTUs are shared by the majority of samples (Fig1). Black percentages in each graph (Fig1) indicate the percentage of OTUs in that site that are present in at least 50% of the samples. As seen in the graph, most are rare taxa detected in only a small portion of samples. It is common to find a small number of OTUs which are dominant in the community, while most others are much less abundant and they are unlikely to be biologically related to any performance or treatment-based effects.

After filtering low-abundance sequences, the remaining data can be explored more easily. We see below the distribution of these taxa at the phylum level.
Next, we observe the distribution of taxa at the family level within each sample location. We can see that while the sample location has a strong impact on distribution, treatment does not affect the distribution of the families at first sight.
In the following interactive graphs we can view the main genera present in the treatments at each location.
Microbiome diversity
Then, we went on to evaluate diversity in the microbiome samples. Microbiome diversity can be assessed through multiple ecological indices that can be divided into two kinds of measures, alpha and beta diversity. Alpha diversity measures the variability of species within a sample while beta diversity accounts for the differences in composition between samples.
Alpha diversity
Alpha diversity analyses showed a higher diversity for cecum compared to the other gut locations, larger microbiome alpha diversity being usually associated to better health status. However, it is also normal for each region of the gut to have different ranges of richness/diversity. The cecal microbiome tends to be very rich, while the small intestines and feces usually have a smaller number of bacterial species.
Further alpha diversity analyses showed a few differences between treatment groups within the cecum samples and within the ileum samples. In the fecal samples there were no differences between treatments regardless of the index used. This is not uncommon; many changes in microbial diversity occur at the level of abundance, rather than richness. In other words, the types of bacteria present in a community may not change very much, but abundances of individual groups of bacteria can shift, generating changes in the ecosystem and representing the real differences between treatments. And this can not always be captured using alpha diversity calculations.




Beta diversity
Hence, we went on to evaluate those differences in bacterial abundances between the samples, this is also known as beta diversity or compositional distance. These distances (In our case Bray Curtis dissimilarities) are often visualized with a method called principal coordinates analysis (PCoA). Each axis represents a combination of features (OTUs) that account for high amounts of variation between samples. The percentage of the differences for which this combination of features accounts is shown on the axis. Samples that are on opposite ends of an axis that accounts for 20% of variability are likely to be more different than samples that are on opposite ends of an axis that only accounts for 5% of the total variability.
These analyses evidenced clear differentiation between samples of different gut locations (pval=0.001), which was biologically expected since different locations of the gut have different dynamics, roles and environmental conditions.
Despite the fact that the differences are not always apparent in the PCoAs, statistical significant differences were also found between the treatments within each location, including significance by calcium level and by AlphaD3 level.
By Treatment:
By Calcium level:
By VitD level:
Microbiome Differential Abundance Analysis
After comparing the grouping of samples by composition distances evidencing general trends in composition between treatments, we went on to look at methods to identify more specific changes. Microbiome data can be filtered, merged, and subset to make specific comparisons between any two groups, and at any taxonomic level. This with the goal to identify features (i.e., species, OTUs, gene families, etc.) that differ in abundance between two groups of samples according to their treatment. We combine here two methods (DESeq and ANCOMBC) by using both, we can identify the features that are identified as significant by both methods giving us higher confidence in the importance of these features.
Comparisons between presence and absence of AlphaD3:
Here we compare two groups: one supplemented with vitD and the other one with out VitD supplementation, with vitD, regardless of calcium level. In the following graphs we can see the features that are differentially abundant in the presence of AlphaD3 compared to the absence of AlphaD3. Hint: when dots are to the left it means these groups are more abundant in presence of AlphaD3. When points are to the right these groups are more prevalent in absence of AlphaD3
We evaluated the physical structures of the intestinal tract tissue to understand the impact of calcium levels on gastrointestinal inflammation and integrity. We employ a scoring system that allows for semi-quantitative analysis of the integrity and inflammatory status of the gut. A score of 0 indicates a normal, healthy gut with no appearance of damage or aberration. A score of 5 for a given metric indicates extreme damage or aberration in the traits being evaluated.
The treatment with recommended levels of Ca (2) with AlphaD3 showed better gastrointestinal tissue health compared to the rest of the groups, particularly those with higher levels of calcium.
Traits evaluated
InflammationSeverity: An overview of changes in the architecture or integrity of the intestines due to inflammation. These changes can be chronic, subacute, and acute. Changes may include ballooning of the crypt, edema, and loss of crypt cell walls. One aspect of the score accounts for the extent of the damage across the various layers of tissue in the intestine.
LymphoidImmune: Infiltration of immune cells into the mucosa or serosa. The presence of immune cells in the mucosa is normal, but most are in or near aggregations of lymphoid tissue called Gut associated Lymphoid Tissue (GALT). High levels of lymphoid cells throughout the mucosa or serosa indicate a subacute adaptive immune response to a recent or current immune challenge. Similarly, growth in the GALT regions can indicate a pronounced antigenic challenge.
MicrobialOrganisms: Normal and abnormal infiltration or attachment of microorganisms into the mucosa. May include assessment of organism type (i.e. parasite, yeast, or bacteria). Some association of microbes to the apical surface is normal. Higher scores indicate infiltration of abnormal microbe types and/or excessive levels.
MucosalIntegrity: Uniformity and consistency of the mucosa at the apical membrane, where absorption occurs. Aberrations in this category include micro-erosion of the microvilli on the apical surface, ulcers, necrosis, and loss of gut-associated lymphoid tissue (GALT). When severe, these changes can include the submucosa level, too.
OverallArchitecture: Morphology and structure of the mucosa. Aberrations would include blunting of villi, loss of mucus-producing goblet cells, and hyperplasia, in which excessive growth of absorptive enterocytes occurs at the expense of other important cell types such as goblet cells and endocrine cells. These aberrations are reparative mechanisms used by the animal, and typically indicate a repeated stress or injury to the GI at this location.
AdditiveScore: Addition of all scores from all animals by treatment group. This is a summary of the overall condition of the intestine. However, it should be viewed with caution, as two groups or animals may have similar additive scores, but very different scores in individual traits. This would mean that the two groups have different underlying pathologies.
The goal of gene expression is to evaluate how the animal host is responding to its environment by altering the levels of various proteins and other compounds in the body in a complex and very controlled manner. When studying gene expression with real-time polymerase chain reaction (PCR), scientists usually investigate changes (increases or decreases) in the expression of a particular gene or set of genes by measuring the abundance of the gene-specific transcript. Here we quantify expression for 3 genes related to inflammation and gut health, as a way to understand how the gut is responding to the gut microbiota and other stimuli. These target genes are: IL-10, IL-1B, and MUC2.
To plot this data we use -log foldchange transformations relative to the control condition (in this case Unchallegend-untreated= negative control), which has all been normalized to your housekeeping gene. The higher the values, the higher the expression of the gene in the treatment compared to the negative control. Significant differences between groups were tested by anova or multiple t-test (ie Tukey’s test), resulting only in some group differences for the expression of IL10 gene in the cecum, and IL1B and IL10 genes in the ileum.
FIGURE 14:

FIGURE 15:

FIGURE 16:

FIGURE 17:

FIGURE 18:

FIGURE 19:

In addition to studying the role of treatments and challenges on SIWA metrics, we are also interested in understanding if there are associations/correlations between traits that might help us better understand the complex interactions in this system.
We begin with Kendall correlations between the top 20 most abundant microbial taxa and the expression of IL-10, IL-1B, and MUC-2. A positive or red correlation indicates that when a taxon is increased, the expression of the correlated gene is also increased. The opposite it true when the correlation is negative (blue). It is important to understand that this correlation does not mean that changes in the microbiome cause changes in gene expression, or vice versa, only that they are moving in the same direction. The asterisk (*) indicates that the correlation is significant.
FIGURE 20:

Histopathology scores are condensed from a 6-point scale (0-5) to a 3-point scale (mild, moderate, severe) in order to facilitate correlations between these scores and other traits of interest. However, due to the unbalanced nature of these scores (see tables below), it can be difficult to accurately assess correlations between these scores and other traits, and any significant correlations should be viewed very carefully.
FIGURE 21:

| InflammationSeverity | LymphoidImmune | MicrobialOrganisms | MucosalIntegrity | OverallArchitecture | |
|---|---|---|---|---|---|
| Mild | 0 | 44 | 24 | 0 | 1 |
| Mod | 40 | 3 | 23 | 32 | 35 |
| Sev | 7 | 0 | 0 | 15 | 11 |
FIGURE 22:

| InflammationSeverity | LymphoidImmune | MicrobialOrganisms | MucosalIntegrity | OverallArchitecture | |
|---|---|---|---|---|---|
| Mild | 20 | 46 | 46 | 0 | 23 |
| Mod | 26 | 0 | 0 | 46 | 23 |
| Sev | 0 | 0 | 0 | 0 | 0 |
Similar to the previous tab, these graphs explore the relationship between histopathology scores and another trait of interest, gene expression. Again, due to imbalances in the dataset, view any changes seen here with caution.
FIGURE 23:
Cecum
FIGURE 24:
Ileum
Below, you will find linear regressions between single traits (alpha diversity or log ratios of bacteria) and gene expression and histopathology. A regression score is calculated (R-squared), and a line is plotted to show the relationship between the traits. A slope from high to low indicates a negative relationship between the traits, while low-high indicates a positive relationship. The R-squared value suggests the size of the effect, and the tables below show the calculated p-values for each regression analysis.
Diversity indexes:
Observed features = represents richness in the sample, defined as the number of different species present in it.
Shannon diversity = This index measures the homogeneity in abundance of the different species in a sample. In other words, shannon index is an estimate of how complex the community is, both the number of different bacteria, and also how different these bacteria are in their function or genetics.
SIWA Ratios:
These ratios each use two major categories of bacteria in an attempt to simplify the complex microbial community into a single value that can tell us something about the state of the community. These values should be interpreted carefully, as they ignore a lot of information and do not tell a complete story. However, they may be correlated with other variables of interest in a useful way.
SIWA Ratio 1= Lactobacillus/ Escherichia-Shigella
Lactobacillus species are overwhelmingly beneficial to the host, while E coli species include commensal strains as well as opportunistic and pathogenic strains. A comparison of the two summarizes the load of both beneficial and potentially pathogenic microbes in the small intestine. A higher ratio of Lactobacillus to Escherichia-Shigella would be expected in healthier animals.
SIWA Ratio 2= Firmicutes/ Proteobacteria
Firmicutes include both beneficial (Lactobacillus, Bacillus, Ruminococcus, Lachnospiraceae, and Pediococcus) and pathogenic (Clostridium, Streptococcus, Staphylococcus, and Listeria) groups. Proteobacteria include pathogenic groups such as E coli, Salmonella, Shigella, Legionella, Vibrio and Pseudomonas. While Firmicutes may contain pathogenic bacteria as well as beneficial ones, a higher ratio of Firmicutes:Proteobacteria would be expected to be associated with better health.
SIWA Ratio 3= Lactobacillus/rest of the genera
Lactobacillus can account for as much as 90% of sequenced bacteria in the ileum of chickens. By tracking this ratio, we can understand if higher or lower levels of Lactobacillus are correlated with health and performance outcomes.
Linear regressions between alpha diversity indexes and gene expression

| R2 | Pval | |
|---|---|---|
| IL10_Observed | 0.0214985 | 0.3364969 |
| IL10_Shannon | 0.0139546 | 0.4396113 |
| IL1B_Observed | 0.0438499 | 0.1625042 |
| IL1B_Shannon | 0.0658108 | 0.0852579 |
| MUC2_Observed | 0.0214081 | 0.3319059 |
| MUC2_Shannon | 0.0029304 | 0.7208617 |

| R2 | Pval | |
|---|---|---|
| IL10_Observed | 0.0008211 | 0.8500625 |
| IL10_Shannon | 0.0000545 | 0.9625288 |
| IL1B_Observed | 0.0106330 | 0.5105307 |
| IL1B_Shannon | 0.0014544 | 0.8081632 |
| MUC2_Observed | 0.0016404 | 0.7965089 |
| MUC2_Shannon | 0.0181933 | 0.3884838 |
Ratios and gene expression
Linear regressions between microbiome ratios and gene expression

| R2 | Pval | |
|---|---|---|
| IL10_Ratio1 | 0.0356593 | 0.2141176 |
| IL10_Ratio2 | 0.0002783 | 0.9133929 |
| IL10_Ratio3 | 0.0014370 | 0.8047359 |
| IL1B_Ratio1 | 0.0157508 | 0.4059367 |
| IL1B_Ratio2 | 0.0000875 | 0.9508042 |
| IL1B_Ratio3 | 0.0019056 | 0.7733005 |
| MUC2_Ratio1 | 0.0000244 | 0.9740346 |
| MUC2_Ratio2 | 0.0067365 | 0.5876363 |
| MUC2_Ratio3 | 0.0007027 | 0.8611755 |

| R2 | Pval | |
|---|---|---|
| IL10_Ratio1 | 0.0128762 | 0.4687510 |
| IL10_Ratio2 | 0.0000008 | 0.9955077 |
| IL10_Ratio3 | 0.0340554 | 0.2361498 |
| IL1B_Ratio1 | 0.0000735 | 0.9564997 |
| IL1B_Ratio2 | 0.0097357 | 0.5290268 |
| IL1B_Ratio3 | 0.0590705 | 0.1163138 |
| MUC2_Ratio1 | 0.0015714 | 0.8007458 |
| MUC2_Ratio2 | 0.0015948 | 0.7992949 |
| MUC2_Ratio3 | 0.0598046 | 0.1139992 |
Diversity indexes by Histopathology score


Microbiome Ratios by Histopathology score
Cecum

Ileum

---
title: "SIWA REPORT E335"
output:
flexdashboard::flex_dashboard:
output_dir: docs
orientation: rows
source_code: embed
vertical_layout: scroll
css: style.css
mathjax: NULL
self_contained: FALSE
#runtime: shiny
#navbar:
# - { title: "About SIWA", href: "https://siwa.bio/" }
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r libraries, include=FALSE, cache=TRUE}
###load libraries panels tabs####
library(phyloseq) #
library(DESeq2) #
library(kableExtra) #
library(genefilter) #
library(microbiome) #
library(ggplot2) #
library(vegan) #
library(ggpubr) #
library(ggplot2) #
library(plyr)
library(multcompView)
#####Load libraries correlations####
library(ggplot2)
library(phyloseq)
library(stringr)
library(dplyr)
#####Load libraries LR####
library(ggplot2)
library(plyr)
library(dplyr)
library(tidyverse)
library(ape)
library(ggpubr)
####Load libraries categorical histo####
library(kableExtra)
library(plyr)
library("fantaxtic")
####load libraries Ratios-histo boxplots####
library(multcompView)
library(reshape)
### OTRICAS
library(plotly)
library(plyr)
library(flexdashboard)
library(shiny)
library(DT)
## FUNCTIONS
source("/Volumes/GoogleDrive/Mi unidad/SIWAproject/Methods-review/functions.R")
```
```{r All inputs, include=FALSE}
#open phyloseq object
ODLEPobj <- readRDS("/Users/mcadavid/Documents/Reports_review/E335_Maria/phyloseqObject_April11.rds")
#open data-table
## Data creada con create_full_file_for_correlations_to_analytics - Jupyter Notebook
complete_sample_table <-read.table("/Users/mcadavid/Documents/Reports_review/E335_Maria/Subset_exp1/Correlations/performance_histo_ge_ratios_alphadiv_for_correlations.csv",check.names = FALSE, header=T, sep="\t")
#EAFIT surveillance tables (same for all reports)
species_taxonomy_info <-read.csv(file="/Users/mcadavid/Documents/Reports_review/Metabolic_effects_table_V2/species_metabolic_effects.csv",check.names = FALSE, sep=";")
genera_taxonomy_info <- read.csv(file="/Users/mcadavid/Documents/Reports_review/Metabolic_effects_table_V2/genus_metabolic_effects.csv", check.names = FALSE, sep=";")
broad_taxonomy_info <- read.csv(file="/Users/mcadavid/Documents/Reports_review/Metabolic_effects_table_V2/broad_groups_metabolic_effects.csv", check.names = FALSE, sep=";")
```
Project description {data-icon="fa-signal"}
=====================================
Row {data-height=50}
-----------------------------------------------------------------------
Row
-----------------------------------------------------------------------
### {data-width=200}
Project E335 was an experimental trial done at Iluma's broiler experimental farm located in the town of Fredonia, Antioquia (Colombia) at 1800m above sea level and an average annual temperature of 20º C. It aimed to evaluate the broilers' microbiome, gene expression and gut histology response to different Calcium and VitD levels in the broiler's diet.
The experiment consisted of 8 treatments differing in Ca concentration across the growth stages and also including presence or absence of VitD in the diet.
Microbiome, gut histology and gene expression data were collected from animals at 33 days of age, following treatment from day of hatch. For microbiome samples, gut content was collected from three different sampling locations: Cecum, Ileum and Feces. For gene expression and histology samples, tissue samples were collected from two sampling locations: Cecum and Ileum (refer to diagram).
### {data-with=800}
```{r picture, include= TRUE, echo=FALSE, out.width = '100%'}
knitr::include_graphics("diagram_E335.png")
```
# Exploration {data-navmenu="Microbiome"}
```{r Explore and filter Phyloseq Object, include=FALSE}
#Explore Phylosec object
rank_names(ODLEPobj)
sample_variables(ODLEPobj)
otu_table(ODLEPobj)
metadata= meta(ODLEPobj)
colnames(metadata)
colnames(tax_table(ODLEPobj))
ntaxa(ODLEPobj)
#Subset phyloseq object to include only wanted treatments
ODLEPobj <- subset_samples(ODLEPobj, TreatmentNumber%in%c("1","2","3","4","5","6","7","8"))
ODLEPobj = subset_samples(ODLEPobj, SampleID != "0074_02C-M") #REMOVE OUTLIAR
metadata= meta(ODLEPobj)
#Filter phyloseq object by location
ODLEPobj_cecum<-subset_samples(ODLEPobj, SampleLocation=="C")
ODLEPobj_ileum<-subset_samples(ODLEPobj, SampleLocation=="I")
ODLEPobj_feces<-subset_samples(ODLEPobj, SampleLocation=="F")
```
```{r Table of treatments, include=FALSE}
#Table of treatment names and numbers
treatments_names <- unique(metadata$Treatment)
names(treatments_names) <- unique(metadata$TreatmentNumber)
treatments=data.frame(treatments_names,names(treatments_names))
colnames(treatments)=c("Treatment","Treatment Number")
kbl(treatments, caption = "Table 0: Treatment Names and Numbers", row.names = FALSE) %>% kable_styling(position= "left", full_width = F)
```
Row {data-height=50}
-----------------------------------------------------------------------
Row
-----------------------------------------------------------------------
### {data-width=200}
As seen in table 1, six microbiome samples per treatment were collected in each gut location, except for four samples that are excluded or missing.
### TABLE 1 {data-width=800}
```{r Table for number of samples per treatment, echo=FALSE, fig.width = 10}
cec = table(subset(as.data.frame(as.matrix(sample_data(ODLEPobj_cecum))), select = c("Treatment")))
ile = table(subset(as.data.frame(as.matrix(sample_data(ODLEPobj_ileum))), select = c( "Treatment")))
fec = table(subset(as.data.frame(as.matrix(sample_data(ODLEPobj_feces))), select = c("Treatment")))
tab=rbind(ile,cec,fec)
tab <- as.data.frame(t(tab))
colnames(tab)=c("Ileum","Cecum","Feces")
kbl(tab, caption = "Table 1: Samples per Group", centering = FALSE) %>% kable_styling(full_width = T, position= "left")
```
Row {data-height=100}
-----------------------------------------------------------------------
Sequences identified during the sequencing process are grouped in batches of identical sequences so that the number of times that sequences was found can be recorded. This is the abundance of the sequence. The groups of identical sequences are called Operational Taxonomic Units
(OTUs) or Amplicon Sequence Variants (ASVs), and they often (but not always) correspond to species or genera of bacteria. In the plot below we can observe the prevalence of OTUs across all samples (the fixed threshold for an OTU to be considered present in a sample is 0.1% relative abundance).
Row {data-height=400}
-----------------------------------------------------------------------
### {data-width=300}
The sequencing process identified more than 1300 unique sequences, and as expected, only a few OTUs are shared by the majority of samples (Fig1). Black percentages in each graph (Fig1) indicate the percentage of OTUs in that site that are present in at least 50% of the samples. As seen in the graph, most are rare taxa detected in only a small portion of samples. It is common to find a small number of OTUs which are dominant in the community, while most others are much less abundant and they are unlikely to be biologically related to any performance or treatment-based effects.
### FIGURE 1 {data-width=700}
```{r Prevalence Plot, fig.height = 4, fig.width= 5, echo=FALSE}
#ODLEPobj_rel <- microbiome::transform(ODLEPobj, "compositional")
#ntaxa(ODLEPobj_rel)
#OPTIONAL: OTUs of interest to highlight in the plot (add to the list the otu or otus of interest you want to highlight in the plot)
otus_of_interest= list()
#Ileum
pseq.rel_ileum <- microbiome::transform(ODLEPobj_ileum, "compositional")
otu_relative_ileum <- as.data.frame(otu_table(pseq.rel_ileum))
#colSums(otu_relative_ileum) #Check if normalized: sum relative abundance added should be one
#remove OTUs that are zero in all samples
otu_relative_ileum = otu_relative_ileum[!apply(otu_relative_ileum, 1, function(x) all(x == 0)), ]
total_samples= ncol(otu_relative_ileum)
total_otus= nrow(otu_relative_ileum)
#absent in how many samples?
absent=apply(otu_relative_ileum ==0, 1, sum) #podria cambiarlo por un threshold para considerar absent una OTU en una muestra <0.001
#add a col with the percentage of samples it is present
otu_relative_ileum$percentage_samples_present=(1-(absent/total_samples))*100
#add a col with OTU names
otu_relative_ileum$OTU <- row.names(otu_relative_ileum)
#add otus of interest to highlight in the plot
otu_relative_ileum$of_interest <- ifelse(otu_relative_ileum$OTU %in% otus_of_interest == TRUE, "YES", "NO")
#plot
colors <- c("#343aeb","#f56342")
sizes <- c(0.05, 4)
prev_plot_ileum=ggplot(data = otu_relative_ileum) +
theme_classic()+
aes(x = reorder(OTU,-percentage_samples_present,sum), y =percentage_samples_present, color=of_interest, size=of_interest) +
geom_point() +
scale_colour_manual(values=colors) +
scale_size_manual (values=sizes)+
#geom_point(color="#343aeb", size=0.05) +
ggtitle("Ileum") +
#xlab("OTU") + ylab("samples where OTU is present (%)")+
theme(legend.position = "none",
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
#Feces
pseq.rel_feces <- microbiome::transform(ODLEPobj_feces, "compositional")
otu_relative_feces <- as.data.frame(otu_table(pseq.rel_feces))
#colSums(otu_relative_feces) #Check if normalized: sum relative abundance added should be one
#remove OTUs that are zero in all samples
otu_relative_feces = otu_relative_feces[!apply(otu_relative_feces, 1, function(x) all(x == 0)), ]
total_samples= ncol(otu_relative_feces)
total_otus= nrow(otu_relative_feces)
#absent in how many samples?
absent=apply(otu_relative_feces ==0, 1, sum) #podria cambiarlo por un threshold para considerar absent una OTU en una muestra <0.001
#add a col with the percentage of samples it is present
otu_relative_feces$percentage_samples_present=(1-(absent/total_samples))*100
#add a col with OTU names
otu_relative_feces$OTU <- row.names(otu_relative_feces)
#add otus of interest to highlight in the plot
otu_relative_feces$of_interest <- ifelse(otu_relative_feces$OTU %in% otus_of_interest == TRUE, "YES", "NO")
#plot
colors <- c("#16DCC9","#f56342")
sizes <- c(0.05, 4)
prev_plot_feces=ggplot(data = otu_relative_feces) +
theme_classic()+
aes(x = reorder(OTU,-percentage_samples_present,sum), y =percentage_samples_present, color=of_interest, size=of_interest) +
geom_point() +
scale_colour_manual(values=colors) +
scale_size_manual (values=sizes)+
#geom_point(color="#16DCC9", size= 0.05) +
ggtitle("Feces") +
#xlab("OTU") + ylab("samples where OTU is present (%)")+
theme(legend.position = "none",
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
#Cecum
pseq.rel_cecum <- microbiome::transform(ODLEPobj_cecum, "compositional")
otu_relative_cecum <- as.data.frame(otu_table(pseq.rel_cecum))
#colSums(otu_relative_cecum) #Check if normalized: sum relative abundance added should be one
#remove OTUs that are zero in all samples
otu_relative_cecum = otu_relative_cecum[!apply(otu_relative_cecum, 1, function(x) all(x == 0)), ]
total_samples= ncol(otu_relative_cecum)
total_otus= nrow(otu_relative_cecum)
#absent in how many samples?
absent=apply(otu_relative_cecum ==0, 1, sum) #podria cambiarlo por un threshold para considerar absent una OTU en una muestra <0.001
#add a col with the percentage of samples it is present
otu_relative_cecum$percentage_samples_present=(1-(absent/total_samples))*100
#add a col with OTU names
otu_relative_cecum$OTU <- row.names(otu_relative_cecum)
#add otus of interest to highlight in the plot
otu_relative_cecum$of_interest <- ifelse(otu_relative_cecum$OTU %in% otus_of_interest == TRUE, "YES", "NO")
#plot
colors <- c("#BA4FC8","#f56342")
sizes <- c(0.05, 4)
prev_plot_cecum=ggplot(data = otu_relative_cecum) +
theme_classic()+
aes(x = reorder(OTU,-percentage_samples_present,sum), y =percentage_samples_present, color=of_interest, size=of_interest) +
geom_point() +
scale_colour_manual(values=colors) +
scale_size_manual (values=sizes)+
#geom_point(color="#BA4FC8", size=0.05) +
ggtitle("Cecum") +
#xlab("OTU") + ylab("samples where OTU is present (%)")+
theme(legend.position = "none",
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank())
## Add dased line to each plot
#ileum
otu_relative_ileum_more_50 = otu_relative_ileum[otu_relative_ileum$percentage_samples_present >= 50, c("OTU", "percentage_samples_present")]
more50_ileum = dim(otu_relative_ileum_more_50)[1]
prev_plot_ileum = prev_plot_ileum + geom_segment(aes(x=0, y=50, xend=more50_ileum, yend=50), linetype="dashed", color="black")
prev_plot_ileum = prev_plot_ileum + geom_segment(aes(x=more50_ileum, y=0, xend=more50_ileum, yend=50), linetype="dashed", color="black")
prev_plot_ileum= prev_plot_ileum + annotate("text", x = more50_ileum, y = 50, label = paste0(" ", round(more50_ileum/dim(otu_relative_ileum)[1] * 100, 1), "% OTUs"))
#cecum
otu_relative_cecum_more_50 = otu_relative_cecum[otu_relative_cecum$percentage_samples_present >= 50, c("OTU", "percentage_samples_present")]
more50_cecum = dim(otu_relative_cecum_more_50)[1]
prev_plot_cecum = prev_plot_cecum + geom_segment(aes(x=0, y=50, xend=more50_cecum, yend=50), linetype="dashed", color="black")
prev_plot_cecum = prev_plot_cecum + geom_segment(aes(x=more50_cecum, y=0, xend=more50_cecum, yend=50), linetype="dashed", color="black")
prev_plot_cecum= prev_plot_cecum + annotate("text", x = more50_cecum, y = 50, label = paste0(" ", round(more50_cecum/dim(otu_relative_cecum)[1] * 100, 1), "% OTUs"))
#feces
otu_relative_feces_more_50 = otu_relative_feces[otu_relative_feces$percentage_samples_present >= 50, c("OTU", "percentage_samples_present")]
more50_feces = dim(otu_relative_feces_more_50)[1]
prev_plot_feces = prev_plot_feces + geom_segment(aes(x=0, y=50, xend=more50_feces, yend=50), linetype="dashed", color="black")
prev_plot_feces = prev_plot_feces + geom_segment(aes(x=more50_feces, y=0, xend=more50_feces, yend=50), linetype="dashed", color="black")
prev_plot_feces= prev_plot_feces + annotate("text", x = more50_feces, y = 50, label = paste0(" ", round(more50_feces/dim(otu_relative_feces)[1] * 100, 1), "% OTUs"))
#Aggregted figure (3 locations prevalence graph)
figure <- ggarrange( prev_plot_cecum, prev_plot_ileum , prev_plot_feces,
ncol = 1, nrow = 3,
font.label = list(size = 10, color = "black", face = "plain"))
annotate_figure(figure,
top = text_grob("Prevalence of OTUs in samples", hjust = 0),
bottom = text_grob("OTUs in order of prevalence"),
left = text_grob("Samples in which OTU is present (%)", rot=90))
```
Row {data-height=550}
-----------------------------------------------------------------------
### {data-width=200}
After filtering low-abundance sequences, the remaining data can be explored more easily. We see below the distribution of these taxa at the phylum level.
### FIGURE 2 {data-width=800}
```{r Prevalence Phylum, echo=FALSE, include=TRUE, cache=TRUE}
pseq.rel <- microbiome::transform(ODLEPobj, "compositional")
flist <- filterfun(kOverA(5, 2e-05))
ODLEPobjRelFilter = filter_taxa(pseq.rel, flist, TRUE)
p <- plot_taxa_prevalence(ODLEPobjRelFilter, "Phylum")+
ggtitle("Prevalence of Major Phyla")+
xlab("Log2 abundance")+
ylab("Prevalence of taxa (%)")
ggplotly(p) %>%
layout(
xaxis = list(automargin=TRUE),
yaxis = list(automargin=TRUE)
) %>% partial_bundle()
```
Row {data-height=550}
-----------------------------------------------------------------------
### {data-width=200}
Next, we observe the distribution of taxa at the family level within each sample location. We can see that while the sample location has a strong impact on distribution, treatment does not affect the distribution of the families at first sight.
### FIGURE 3 {data-width=800}
```{r Family distribution, include=FALSE}
#Family ditribution
ps.com.fam <- ranomaly::aggregate_top_taxa(ODLEPobjRelFilter,top = 8, "Family")
#ps.com.fam <- microbiome::aggregate_taxa(ODLEPobjRelFilter, "Family", top= 12)
ps_df <- microbiomeutilities::phy_to_ldf(ps.com.fam, transform.counts = "compositional")
```
```{r Family Plot, echo=FALSE}
plotfam=ggstripchart(ps_df, "SampleLocation", "Abundance",
facet.by = "Family", color = "Treatment",
#size = 5,
ylab= "Relative abundances",
title= "Family distribution by sample locations")
p<- ggpar(plotfam,legend = "right") #+ theme(text=element_text(size=25))
ggplotly(p) %>%
layout(
xaxis = list(automargin=TRUE),
yaxis = list(automargin=TRUE)
) %>% partial_bundle()
```
# Taxonomic composition {data-navmenu="Microbiome"}
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=50}
-----------------------------------------------------------------------
In the following interactive graphs we can view the main genera present in the treatments at each location.
```{r Taxonomic composition , include=FALSE, warning=FALSE}
#####Plot Taxonomic composition feces#####
#Aggregate OTUs according to genus
ODLEPobj_cecum_glom = tax_glom(ODLEPobj_cecum, taxrank="Genus", NArm=FALSE)
#create dataframe from phyloseq object
subset.genus.df <- psmelt(ODLEPobj_cecum_glom)
subset.genus.df$genus <- as.character(subset.genus.df$Genus) #convert to character
#calculate median rel. abundance
medians <- ddply(subset.genus.df, ~genus, function(x) c(median=median(x$Abundance)))
#calculate remainder
remainder <- medians[medians$median <= 0.01,]$genus
subset.genus.df[subset.genus.df$genus %in% remainder,]$genus <- "Genera < 1% abund."
#Figures abundances
subset.genus.df <- subset.genus.df %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
#plot with condensed genera into "< 1% abund" category by Treatment
plot_abundance_cecum<- ggplot(subset.genus.df, aes(x = Treatment, y = Abundance, fill = genus)) +
geom_bar(stat="identity", position="fill") +
labs(title ="Taxonomic composition cecum") +
xlab("Treatment") +
ylab("Relative abundance")+
#stat_compare_means(method = "anova")+
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
theme(text=element_text(size=12))
#####Plot Taxonomic composition ileum#####
#Aggregate OTUs according to genus
ODLEPobj_ileum_glom = tax_glom(ODLEPobj_ileum, taxrank="Genus", NArm=FALSE)
#create dataframe from phyloseq object
subset.genus.df <- psmelt(ODLEPobj_ileum_glom)
subset.genus.df$genus <- as.character(subset.genus.df$Genus) #convert to character
#calculate median rel. abundance
medians <- ddply(subset.genus.df, ~genus, function(x) c(median=median(x$Abundance)))
#calculate remainder
remainder <- medians[medians$median <= 0.01,]$genus
subset.genus.df[subset.genus.df$genus %in% remainder,]$genus <- "Genera < 1% abund."
#Figures abundances
subset.genus.df <- subset.genus.df %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
#plot with condensed genera into "< 1% abund" category by Treatment
p <- ggplot(data=subset.genus.df, aes(x=reorder(Treatment, KitID), y=Abundance ,fill=genus))
plot_abundance_ileum<- p + geom_bar(stat="identity", position="fill") +
labs(title ="Taxonomic composition ileum") +
xlab("Treatment") +
ylab("Relative abundance")+
#stat_compare_means(method = "anova")+
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
theme(text=element_text(size=12))
#####Plot Taxonomic composition feces#####
#Aggregate OTUs according to genus
ODLEPobj_feces_glom = tax_glom(ODLEPobj_feces, taxrank="Genus", NArm=FALSE)
#create dataframe from phyloseq object
subset.genus.df <- psmelt(ODLEPobj_feces_glom)
subset.genus.df$genus <- as.character(subset.genus.df$Genus) #convert to character
#calculate median rel. abundance
medians <- ddply(subset.genus.df, ~genus, function(x) c(median=median(x$Abundance)))
#calculate remainder
remainder <- medians[medians$median <= 0.01,]$genus
subset.genus.df[subset.genus.df$genus %in% remainder,]$genus <- "Genera < 1% abund."
#Figures abundances
subset.genus.df <- subset.genus.df %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
#plot with condensed genera into "< 1% abund" category by Treatment
p <- ggplot(data=subset.genus.df, aes(x=Treatment, y=Abundance ,fill=genus))
plot_abundance_feces<- p + geom_bar(stat="identity", position="fill") +
labs(title ="Taxonomic composition feces") +
xlab("Treatment") +
ylab("Relative abundance")+
#stat_compare_means(method = "anova")+
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
theme(text=element_text(size=12))
```
Row {data-height=500}
-----------------------------------------------------------------------
### FIGURE 4A {data-width=1000}
```{r, echo=FALSE, include=TRUE}
ggplotly(plot_abundance_cecum) %>%
layout(
xaxis = list(automargin=TRUE),
yaxis = list(automargin=TRUE)
) %>% partial_bundle()
#ggplotly(plot_abundance_cecum) %>% partial_bundle()
```
Row {data-height=500}
-----------------------------------------------------------------------
### FIGURE 4B {data-width=1000}
```{r, echo=FALSE, include=TRUE}
ggplotly(plot_abundance_ileum) %>% partial_bundle()
```
Row {data-height=500}
-----------------------------------------------------------------------
### FIGURE 4C {data-width=1000}
```{r, echo=FALSE, include=TRUE}
ggplotly(plot_abundance_feces) %>% partial_bundle()
```
# Diversity {data-navmenu="Microbiome"}
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Microbiome diversity**
Row {data-height=50}
-----------------------------------------------------------------------
Then, we went on to evaluate diversity in the microbiome samples. Microbiome diversity can be assessed through multiple ecological indices that can be divided into two kinds of measures, alpha and beta diversity. Alpha diversity measures the variability of species within a sample while beta diversity accounts for the differences in composition between samples.
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Alpha diversity**
Row {data-height=500}
-----------------------------------------------------------------------
### {data-width=350}
Alpha diversity analyses showed a higher diversity for cecum compared to the other gut locations, larger microbiome alpha diversity being usually associated to better health status. However, it is also normal for each region of the gut to have different ranges of richness/diversity. The cecal microbiome tends to be very rich, while the small intestines and feces usually have a smaller number of bacterial species.
Further alpha diversity analyses showed a few differences between treatment groups within the cecum samples and within the ileum samples. In the fecal samples there were no differences between treatments regardless of the index used. This is not uncommon; many changes in microbial diversity occur at the level of abundance, rather than richness. In other words, the types of bacteria present in a community may not change very much, but abundances of individual groups of bacteria can shift, generating changes in the ecosystem and representing the real differences between treatments. And this can not always be captured using alpha diversity calculations.
### FIGURE 5 {data-width=650}
```{r , echo=FALSE}
#alpha diversity comparing locations
pdiv <- plot_anova_diversity_edit(ODLEPobj, method = c("richness", "shannon"),
grouping_column = "SampleLocation", pValueCutoff=0.05) +
xlab(NULL) + ylab(NULL) + theme(legend.position = "none")
pdiv <- annotate_figure(pdiv, top = text_grob("Alpha diversity by sample location", color = "black", face = "bold"))
#ggplotly(pdiv) %>% partial_bundle()
pdiv
```
Row {data-height=400}
-----------------------------------------------------------------------
### FIGURE 6A
```{r , echo=FALSE}
#Alpha diversity with ANOVA separated by location
#Cecum
pc <-plot_anova_diversity_edit(ODLEPobj_cecum, method = c("richness", "shannon"),
grouping_column ="Treatment",
pValueCutoff=0.05)+ xlab(NULL)+ ylab(NULL)+
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
labs(title ="Cecum") +theme(legend.position = "none",
axis.title.y=element_text(angle = 0, vjust = .5, size = 12)) +
theme(text=element_text(size=10))
#ggplotly(pc) %>% partial_bundle()
pc
```
### FIGURE 6B
```{r , echo=FALSE}
#feces
pf <-plot_anova_diversity_edit(ODLEPobj_feces, method = c("richness", "shannon"),
grouping_column ="Treatment",
pValueCutoff=0.05)+ xlab(NULL)+ ylab(NULL)+
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
labs(title ="Feces") +theme(legend.position = "none",
axis.title.y=element_text(angle = 0, vjust = .5, size = 12)) +
theme(text=element_text(size=10))
#ggplotly(pf) %>% partial_bundle()
pf
```
### FIGURE 6C
```{r , echo=FALSE}
#ileum
pi <- plot_anova_diversity_edit(ODLEPobj_ileum, method = c("richness", "shannon"),
grouping_column ="Treatment",
pValueCutoff=0.05)+ xlab(NULL)+ ylab(NULL)+
theme(axis.text.x=element_blank(), axis.ticks.x=element_blank()) +
labs(title ="Ileum") +theme(legend.position = "none",
axis.title.y=element_text(angle = 0, vjust = .5, size = 12)) +
theme(text=element_text(size=10))
#ggplotly(pi) %>% partial_bundle()
pi
```
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Beta diversity**
Row {data-height=100}
-----------------------------------------------------------------------
Hence, we went on to evaluate those differences in bacterial abundances between the samples, this is also known as beta diversity or compositional distance. These distances (In our case Bray Curtis dissimilarities) are often visualized with a method called principal coordinates analysis (PCoA). Each axis represents a combination of features (OTUs) that account for high amounts of variation between samples. The percentage of the differences for which this combination of features accounts is shown on the axis. Samples that are on opposite ends of an axis that accounts for 20% of variability are likely to be more different than samples that are on opposite ends of an axis that only accounts for 5% of the total variability.
Row {data-height=80}
-----------------------------------------------------------------------
These analyses evidenced clear differentiation between samples of different gut locations (pval=0.001), which was biologically expected since different locations of the gut have different dynamics, roles and environmental conditions.
```{r , include=FALSE}
bray.dist <- phyloseq::distance(ODLEPobj, method = "bray")
metadata <- as(sample_data(ODLEPobj), "data.frame")
out.bray <- ordinate(ODLEPobj, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj, out.bray, color="SampleLocation" )
group_location <- c("Cecal","Fecal", "Ileum")
colores <- c("#BA4FC8","#16DCC9", "#343aeb")
plot.beta.bray <- beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
scale_colour_manual(values=colores, labels = group_location) + #cambiar colores y poner nombre a cada grupo en leyenda
ggtitle("Dissimilarity between all samples (Bray distances)") +
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank())+
theme(text=element_text(size=10))
```
Row {data-height=500}
-----------------------------------------------------------------------
### FIGURE 7 {data-width=1000}
```{r , echo=FALSE}
ggplotly(plot.beta.bray) %>% partial_bundle()
```
Row {data-height=50}
-----------------------------------------------------------------------
Despite the fact that the differences are not always apparent in the PCoAs, statistical significant differences were also found between the treatments within each location, including significance by calcium level and by AlphaD3 level.
Row {data-height=40}
-----------------------------------------------------------------------
By Treatment:
Row {data-height=500}
-----------------------------------------------------------------------
```{r, include=FALSE}
#####Betadiverisity Bray Cecum#####
bray.dist <- phyloseq::distance(ODLEPobj_cecum, method = "bray")
metadata <- as(sample_data(ODLEPobj_cecum), "data.frame")
out.bray <- ordinate(ODLEPobj_cecum, method = "MDS", distance = "bray")
beta.bray <- plot_ordination(ODLEPobj_cecum, out.bray, color="Treatment", axes=c(1, 2))
plot.beta.bray.cecum <- beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Cecum", col= "Treatment") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), legend.position="bottom")
#text_grob("Dissimilarity between samples by location (Bray Distance)
#####Betadiverisity Bray Ileum#####
bray.dist <- phyloseq::distance(ODLEPobj_ileum, method = "bray")
out.bray <- ordinate(ODLEPobj_ileum, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj_ileum, out.bray, color="Treatment")
plot.beta.bray.ileum <- beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Ileum", col= "Treatment") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), legend.position="bottom")
#####Betadiverisity Bray Feces#####
bray.dist <- phyloseq::distance(ODLEPobj_feces, method = "bray")
out.bray <- ordinate(ODLEPobj_feces, method = "MDS", distance = "bray")
beta.bray <- plot_ordination(ODLEPobj_feces, out.bray, color="Treatment")
plot.beta.bray.feces <- beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Feces", col= "Treatment") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.text.y = element_blank(), axis.ticks = element_blank(), legend.position="bottom")
plot.beta.bray.feces
```
### FIGURE 8A
```{r , echo=FALSE}
ggplotly(plot.beta.bray.cecum)
```
### FIGURE 8B
```{r , echo=FALSE}
ggplotly(plot.beta.bray.ileum)
```
### FIGURE 8C
```{r , echo=FALSE}
ggplotly(plot.beta.bray.feces)
```
Row {data-height=40}
-----------------------------------------------------------------------
By Calcium level:
Row {data-height=500}
-----------------------------------------------------------------------
```{r, include=FALSE}
#####Betadiverisity Bray Cecum#####
out.bray <- ordinate(ODLEPobj_cecum, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj_cecum, out.bray, color="Calcium_level", axes=c(1, 2))
plot.beta.bray.cecum<-beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Cecum", col= "Calcium_level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
#####Betadiverisity Bray Ileum#####
out.bray <- ordinate(ODLEPobj_ileum, method = "MDS", distance = "bray")
beta.bray<-plot_ordination(ODLEPobj_ileum, out.bray, color="Calcium_level")
plot.beta.bray.ileum<-beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Ileum", col= "Calcium_level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
#####Betadiverisity Bray Feces#####
out.bray <- ordinate(ODLEPobj_feces, method = "MDS", distance = "bray")
beta.bray<-plot_ordination(ODLEPobj_feces, out.bray, color="Calcium_level")
plot.beta.bray.feces<-beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Feces", col= "Calcium_level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
```
### FIGURE 9A
```{r , echo=FALSE}
ggplotly(plot.beta.bray.cecum)
```
### FIGURE 9B
```{r , echo=FALSE}
ggplotly(plot.beta.bray.ileum)
```
### FIGURE 9C
```{r , echo=FALSE}
ggplotly(plot.beta.bray.feces)
```
Row {data-height=40}
-----------------------------------------------------------------------
By VitD level:
Row {data-height=500}
-----------------------------------------------------------------------
```{r , echo=FALSE}
#####Betadiverisity Bray Cecum#####
#Ordination beta
out.bray <- ordinate(ODLEPobj_cecum, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj_cecum, out.bray, color="Alphad3level", axes=c(1, 2))
plot.beta.bray.cecum = beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Cecum", col= "Alphad3level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
#plot.beta.bray.cecum
#####Betadiverisity Bray Ileum#####
#Ordination beta
out.bray <- ordinate(ODLEPobj_ileum, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj_ileum, out.bray, color="Alphad3level")
plot.beta.bray.ileum = beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Ileum", col= "Alphad3level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
#plot.beta.bray.ileum
#####Betadiverisity Bray Feces#####
#Ordination beta
out.bray <- ordinate(ODLEPobj_feces, method = "MDS", distance = "bray")
beta.bray = plot_ordination(ODLEPobj_feces, out.bray, color="Alphad3level")
plot.beta.bray.feces = beta.bray + geom_point(size=4, alpha=0.75, na.rm=T) + #sobreponer un punto mas grande
labs(title = "Feces", col= "Alphad3level") + #titulo de la grafica y de la leyenda
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
#plot.beta.bray.feces
```
### FIGURE 10A
```{r , echo=FALSE}
ggplotly(plot.beta.bray.cecum)
```
### FIGURE 10B
```{r , echo=FALSE}
ggplotly(plot.beta.bray.ileum)
```
### FIGURE 10C
```{r , echo=FALSE}
ggplotly(plot.beta.bray.feces)
```
Row {data-height=30}
-----------------------------------------------------------------------
# Differential Abundance {data-navmenu="Microbiome"}
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Microbiome Differential Abundance Analysis**
Row {data-height=100}
-----------------------------------------------------------------------
After comparing the grouping of samples by composition distances evidencing general trends in composition between treatments, we went on to look at methods to identify more specific changes. Microbiome data can be filtered, merged, and subset to make specific comparisons between any two groups, and at any taxonomic level. This with the goal to identify features (i.e., species, OTUs, gene families, etc.) that differ in abundance between two groups of samples according to their treatment. We combine here two methods (DESeq and ANCOMBC) by using both, we can identify the features that are identified as significant by both methods giving us higher confidence in the importance of these features.
Row {data-height=30}
-----------------------------------------------------------------------
**Comparisons between presence and absence of AlphaD3:**
Row {data-height=50}
-----------------------------------------------------------------------
Here we compare two groups: one supplemented with vitD and the other one with out VitD supplementation, with vitD, regardless of calcium level. In the following graphs we can see the features that are differentially abundant in the presence of AlphaD3 compared to the absence of AlphaD3. Hint: when dots are to the left it means these groups are more abundant in presence of AlphaD3. When points are to the right these groups are more prevalent in absence of AlphaD3
```{r , echo=FALSE}
```
# Metabolic information of species and genera {data-navmenu="Microbiome"}
Row {data-height=3000}
-----------------------------------------------------------------------
#### TABLE 2
```{r sp groups table , echo=FALSE}
genera_taxonomy_info$Species <- "All species"
df= rbind(species_taxonomy_info,genera_taxonomy_info)
df=df[c("Genus","Species","Key_Findings", "Probiotic_potential", "Takeaways")]
colnames(df)[3] <- "Key_Findings"
colnames(df)[4] <- "Probiotic_Potential"
colnames(df)[5] <- "Proven_Traits"
DT::datatable(df,
rownames = FALSE, options = list(pageLength = 10)
)
```
# Metabolic information of brad groups {data-navmenu="Microbiome"}
Row {data-height=3000}
-----------------------------------------------------------------------
#### TABLE 3
```{r broad groups table , echo=FALSE}
df_2=broad_taxonomy_info[c("Group_Name","Finner_Classification", "Key_Findings")]
colnames(df_2)[2] <- "Taxonomic_level"
colnames(df_2)[3] <- "Key_Findings"
DT::datatable(df_2, height = 1000,
rownames = FALSE, options = list(pageLength = 10)
)
```
# Histopathology {data-icon="fa-table"}
#### Histopathology Analysis
We evaluated the physical structures of the intestinal tract tissue to understand the impact of calcium levels on gastrointestinal inflammation and integrity. We employ a scoring system that allows for semi-quantitative analysis of the integrity and inflammatory status of the gut. A score of 0 indicates a normal, healthy gut with no appearance of damage or aberration. A score of 5 for a given metric indicates extreme damage or aberration in the traits being evaluated.
The treatment with recommended levels of Ca (2) with AlphaD3 showed better gastrointestinal tissue health compared to the rest of the groups, particularly those with higher levels of calcium.
**Traits evaluated**
*InflammationSeverity*: An overview of changes in the architecture or integrity of the intestines due to inflammation. These changes can be chronic, subacute, and acute. Changes may include ballooning of the crypt, edema, and loss of crypt cell walls. One aspect of the score accounts for the extent of the damage across the various layers of tissue in the intestine.
*LymphoidImmune*: Infiltration of immune cells into the mucosa or serosa. The presence of immune cells in the mucosa is normal, but most are in or near aggregations of lymphoid tissue called Gut associated Lymphoid Tissue (GALT). High levels of lymphoid cells throughout the mucosa or serosa indicate a subacute adaptive immune response to a recent or current immune challenge. Similarly, growth in the GALT regions can indicate a pronounced antigenic challenge.
*MicrobialOrganisms*: Normal and abnormal infiltration or attachment of microorganisms into the mucosa. May include assessment of organism type (i.e. parasite, yeast, or bacteria). Some association of microbes to the apical surface is normal. Higher scores indicate infiltration of abnormal microbe types and/or excessive levels.
*MucosalIntegrity*: Uniformity and consistency of the mucosa at the apical membrane, where absorption occurs. Aberrations in this category include micro-erosion of the microvilli on the apical surface, ulcers, necrosis, and loss of gut-associated lymphoid tissue (GALT). When severe, these changes can include the submucosa level, too.
*OverallArchitecture*: Morphology and structure of the mucosa. Aberrations would include blunting of villi, loss of mucus-producing goblet cells, and hyperplasia, in which excessive growth of absorptive enterocytes occurs at the expense of other important cell types such as goblet cells and endocrine cells. These aberrations are reparative mechanisms used by the animal, and typically indicate a repeated stress or injury to the GI at this location.
*AdditiveScore*: Addition of all scores from all animals by treatment group. This is a summary of the overall condition of the intestine. However, it should be viewed with caution, as two groups or animals may have similar additive scores, but very different scores in individual traits. This would mean that the two groups have different underlying pathologies.
```{r Histo,fig.width = 10, fig.height = 5, include=FALSE}
histo <- complete_sample_table
#change labels
#Add treatments and treatment numbers (1-8)
histo <- histo %>%
mutate(TreatmentNumber = if_else(KitID == 70, 1,
ifelse(KitID == 71, 2,
ifelse(KitID == 72, 3,
ifelse(KitID == 73, 4,
ifelse(KitID == 74, 5,
ifelse(KitID == 75, 6,
ifelse(KitID == 76, 7,8))))))))
histo <- histo %>%
mutate(Treatment = if_else(KitID == 70, "LowCa",
ifelse(KitID == 71, "LowCa+MediumAlphaD3",
ifelse(KitID == 72, "MediumLowCa",
ifelse(KitID == 73, "MediumLowCa+MediumAlphaD3",
ifelse(KitID == 74, "MediumHighCa",
ifelse(KitID == 75, "MediumHighCa+MediumAlphaD3",
ifelse(KitID == 76, "HighCa",
ifelse(KitID == 77, "HighCa+MediumAlphaD3","other")))))))))
histo <- histo %>%
mutate(Alphalevel = if_else(KitID == 70, "0 VitD",
ifelse(KitID == 71, "VitD",
ifelse(KitID == 72, "0 VitD",
ifelse(KitID == 73, "VitD",
ifelse(KitID == 74, "0 VitD",
ifelse(KitID == 75, "VitD",
ifelse(KitID == 76, "0 VitD","VitD"))))))))
#subset by location
histo_cecum <- histo[histo$SampleLocation == 'C',]
histo_ileum <- histo[histo$SampleLocation == 'I',]
####Anova tests
#cecum AdditiveScore
res.aov <- aov(AdditiveScore ~ Treatment, data = histo_cecum)
summary(res.aov) #Additive score changes significantly with treatment in cecum. pval=0.00232*
#plot(res.aov, 1) # test 1. Homogeneity of variances
#plot(res.aov, 2) # test 2. Normality
tukey <- TukeyHSD(res.aov)
print(tukey)
#letters to add statistics to graph
cld <- multcompLetters4(res.aov, tukey)
cld2 <- data.frame(letters = cld$'Treatment'$Letters) #table with letters per treatment
cld2$Treatment <- rownames(cld2)
names(cld2)[1] <- "histosig"
#cecum other variables
summary(aov(InflammationSeverity ~ Treatment, data = histo_cecum)) #pval= 0.039**
summary(aov(LymphoidImmune ~ Treatment, data = histo_cecum)) #pval= 0.308
summary(aov(MicrobialOrganisms ~ Treatment, data = histo_cecum)) #pval= 0.792
summary(aov(MucosalIntegrity ~ Treatment, data = histo_cecum)) #pval= 0.0714.
summary(aov(OverallArchitecture ~ Treatment, data = histo_cecum)) #pval= 0.0039**
#cecum Scores by level of alphad3
summary(aov(AdditiveScore ~ Alphalevel, data = histo_cecum)) #pval= 0.291
summary(aov(InflammationSeverity ~ Alphalevel, data = histo_cecum)) #pval= 0.25
summary(aov(LymphoidImmune ~ Alphalevel, data = histo_cecum)) #pval= 0.224
summary(aov(MicrobialOrganisms ~ Alphalevel, data = histo_cecum)) #pval= 0.454
summary(aov(MucosalIntegrity ~ Alphalevel, data = histo_cecum)) #pval= 0.655
summary(aov(OverallArchitecture ~ Alphalevel, data = histo_cecum)) #pval= 0.926
#ileum AdditiveScore
res.aov <- aov(AdditiveScore ~ Treatment, data = histo_ileum)
summary(res.aov) #Additive score does not change significantly with treatment in ileum. pval=0.205
tukey <- TukeyHSD(res.aov)
print(tukey)
#letters to add statistics to graph
cld <- multcompLetters4(res.aov, tukey)
cld3 <- data.frame(letters = cld$'Treatment'$Letters) #table with letters per treatment
cld3$Treatment <- rownames(cld3)
names(cld3)[1] <- "histosig"
#no need to do tukey or eval assumptions
#tukey <-TukeyHSD(res.aov) #Only when previous anova has significant pval
#ileum other variables
summary(aov(InflammationSeverity ~ Treatment, data = histo_ileum)) #pval= 0.054.
summary(aov(LymphoidImmune ~ Treatment, data = histo_ileum)) #pval= 0.0181*
summary(aov(MicrobialOrganisms ~ Treatment, data = histo_ileum)) #pval= 0.548
summary(aov(MucosalIntegrity ~ Treatment, data = histo_ileum)) #pval= 0.090
summary(aov(OverallArchitecture ~ Treatment, data = histo_ileum)) #pval= 0.591
#ileum Scores by level of alphad3
summary(aov(AdditiveScore ~ Alphalevel, data = histo_ileum)) #pval= 0.306
summary(aov(InflammationSeverity ~ Alphalevel, data = histo_ileum)) #pval= 0.218
summary(aov(LymphoidImmune ~ Alphalevel, data = histo_ileum)) #pval= 0.0201*
summary(aov(MicrobialOrganisms ~ Alphalevel, data = histo_ileum)) #pval=1
summary(aov(MucosalIntegrity ~ Alphalevel, data = histo_ileum)) #pval= 0.599
summary(aov(OverallArchitecture ~ Alphalevel, data = histo_ileum)) #pval= 0.357
#pivot table to plot
histo_cecum2 <- histo_cecum %>%
pivot_longer(cols = c("OverallArchitecture", "MucosalIntegrity", "LymphoidImmune", "InflammationSeverity", "MicrobialOrganisms"), names_to = "ScoreCategory", values_to = "Value")
histo_ileum2 <- histo_ileum %>%
pivot_longer(cols = c("OverallArchitecture", "MucosalIntegrity", "LymphoidImmune", "InflammationSeverity", "MicrobialOrganisms"), names_to = "ScoreCategory", values_to = "Value")
```
```{r Histo plots by sample,fig.width = 10, fig.height = 5, include=FALSE}
#Stacked bars by SAMPLE
histo_by_samples_cecum <- ggplot(histo_cecum2, aes(fill = ScoreCategory, y =Value , x = SampleID)) +
geom_bar(stat = "identity") +
ggtitle("Histopathology scores by sample (cecum)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
histo_by_samples_cecum
histo_by_samples_ileum <- ggplot(histo_ileum2, aes(fill = ScoreCategory, y =Value , x = SampleID)) +
geom_bar(stat = "identity") +
ggtitle("Histopathology scores by sample (ileum)") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
histo_by_samples_ileum
```
Row
-----------------------------------------------------------------------
### FIGURE 12 {data-width=500}
```{r Histo plots by treatment-cec,fig.width = 10, fig.height = 8, echo=FALSE}
#Satcked bars by TREATMENT
#cecum
histo_c <- split(histo_cecum2, histo_cecum2$Treatment)
dff <- lapply(histo_c, function(x) ddply(x, .(ScoreCategory),
summarize, mean=mean(Value)))
listDF <- list()
for (i in 1:length(dff)){
#print(i)
dff[[i]]$Treatment <- names(dff)[i]
listDF[[i]] <- dff[[i]]
}
dfff <- do.call(rbind, listDF)
dfff <- dfff %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
treatments_cecum <- ggplot(dfff, aes(fill = ScoreCategory, y =mean , x =Treatment)) +
ylab("Mean Score") + xlab("") +
ggtitle("Histopathology scores agreggated by Treatment (cecum)") +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
theme(text=element_text(size=15))
#Add significancy letters manually
histo_cec_plot <- treatments_cecum+ annotate("text", x = 7, y = 13, label = cld2$histosig[2]) +
annotate("text", x = 6, y = 14, label = cld2$histosig[5]) +
annotate("text", x = 3, y = 14, label = cld2$histosig[3]) +
annotate("text", x = 8, y = 14, label = cld2$histosig[1]) +
annotate("text", x = 4, y = 14, label = cld2$histosig[8]) +
annotate("text", x = 5, y = 14, label = cld2$histosig[4]) +
annotate("text", x = 2, y = 14, label = cld2$histosig[6]) +
annotate("text", x = 1, y = 14, label = cld2$histosig[7])
ggplotly(histo_cec_plot)
```
Row
-----------------------------------------------------------------------
### FIGURE 13 {data-width=500}
```{r Histo plots by treatment-ile,fig.width = 10, fig.height = 8, echo=FALSE}
#ileum
histo_i <- split(histo_ileum2, histo_ileum2$Treatment)
dff <- lapply(histo_i, function(x) ddply(x, .(ScoreCategory),
summarize, mean=mean(Value)))
listDF <- list()
for (i in 1:length(dff)){
#print(i)
dff[[i]]$Treatment <- names(dff)[i]
listDF[[i]] <- dff[[i]]
}
dfff <- do.call(rbind, listDF)
dfff <- dfff %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
treatments_ileum <- ggplot(dfff, aes(fill = ScoreCategory, y =mean , x = Treatment)) +
ylab("Mean Score") + xlab("") +
ggtitle("Histopathology scores agreggated by Treatment (ileum)") +
geom_bar(stat = "identity") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
theme(text=element_text(size=15))
#Add significancy letters manually
histo_cec_plot <- treatments_ileum+ annotate("text", x = 1:2, y = 7, label = cld3$histosig[1]) +
annotate("text", x = 5, y = 8, label = cld3$histosig[1]) +
annotate("text", x = 7, y = 8, label = cld3$histosig[1]) +
annotate("text", x = 3, y = 8, label = cld3$histosig[1]) +
annotate("text", x = 4, y = 8, label = cld3$histosig[1]) +
annotate("text", x = 6, y = 8, label = cld3$histosig[1]) +
annotate("text", x = 8, y = 8, label = cld3$histosig[1])
ggplotly(histo_cec_plot)
```
Row
-----------------------------------------------------------------------
# Gene Expression {data-icon="fa-table"}
#### Gene Expression Analysis
The goal of gene expression is to evaluate how the animal host is responding to its environment by altering the levels of various proteins and other compounds in the body in a complex and very controlled manner. When studying gene expression with real-time polymerase chain reaction (PCR), scientists usually investigate changes (increases or decreases) in the expression of a particular gene or set of genes by measuring the abundance of the gene-specific transcript. Here we quantify expression for 3 genes related to inflammation and gut health, as a way to understand how the gut is responding to the gut microbiota and other stimuli. These target genes are: IL-10, IL-1B, and MUC2.
To plot this data we use -log foldchange transformations relative to the control condition (in this case Unchallegend-untreated= negative control), which has all been normalized to your housekeeping gene. The higher the values, the higher the expression of the gene in the treatment compared to the negative control. Significant differences between groups were tested by anova or multiple t-test (ie
Tukey’s test), resulting only in some group differences for the expression of IL10 gene in the cecum, and IL1B and IL10 genes in the ileum.
```{r Gene expression ,fig.width = 10, fig.height = 5, include=FALSE}
GENE <- complete_sample_table
#Add treatments and treatment numbers (1-8)
GENE <- GENE %>%
mutate(Treatment = if_else(KitID == 70, "LowCa",
ifelse(KitID == 71, "LowCa+MediumAlphaD3",
ifelse(KitID == 72, "MediumLowCa",
ifelse(KitID == 73, "MediumLowCa+MediumAlphaD3",
ifelse(KitID == 74, "MediumHighCa",
ifelse(KitID == 75, "MediumHighCa+MediumAlphaD3",
ifelse(KitID == 76, "HighCa","HighCa+MediumAlphaD3"))))))))
GENE <- GENE %>%
mutate(TreatmentNumber = if_else(KitID == 70, 1,
ifelse(KitID == 71, 2,
ifelse(KitID == 72, 3,
ifelse(KitID == 73, 4,
ifelse(KitID == 74, 5,
ifelse(KitID == 75, 6,
ifelse(KitID == 76, 7,8))))))))
GENE <- GENE %>%
mutate(Alphalevel = if_else(KitID == 70, "0 VitD",
ifelse(KitID == 71, "VitD",
ifelse(KitID == 72, "0 VitD",
ifelse(KitID == 73, "VitD",
ifelse(KitID == 74, "0 VitD",
ifelse(KitID == 75, "VitD",
ifelse(KitID == 76, "0 VitD","VitD"))))))))
GENE$TreatmentNumber = as.factor(GENE$TreatmentNumber)
GENE$SampleLocation = as.factor(GENE$SampleLocation)
#Table:Samples by treatments
treatments = data.frame(table(GENE$TreatmentNumber))
colnames(treatments)=c("Treatment","Number of samples")
kbl(treatments,row.names = FALSE, align=rep('c', 2)) %>% kable_styling()
#Table:Samples by Samplelocation
location = data.frame(table(GENE$SampleLocation))
colnames(location)=c("Sample Location","Number of samples")
kbl(location,row.names = FALSE, align=rep('c', 2)) %>% kable_styling()
#Subset kits de interés y separar por location
subset <- GENE[GENE$TreatmentNumber %in% c(1,2,3,4,5,6,7,8),]
cec <- subset[subset$SampleLocation == "C",]
il <- subset[subset$SampleLocation == "I",]
#Calcular referencias (en este caso uso el tratamiento 3-MEDIUMLOWCA como mi control para el delta delta)
ref_value_IL10_i <- mean(il[il$TreatmentNumber == "3",]$DeltaCq_IL10)
ref_value_IL1B_i <- mean(il[il$TreatmentNumber == "3",]$DeltaCq_IL1B)
ref_value_MUC2_i <- mean(il[il$TreatmentNumber == "3",]$DeltaCq_MUC2)
ref_value_IL10_c <- mean(cec[cec$TreatmentNumber == "3",]$DeltaCq_IL10)
ref_value_IL1B_c <- mean(cec[cec$TreatmentNumber == "3",]$DeltaCq_IL1B)
ref_value_MUC2_c <- mean(cec[cec$TreatmentNumber == "3",]$DeltaCq_MUC2)
#Cecum calculations deltadelta and rq and new log transformation
cec$DD_IL1B<-cec$DeltaCq_IL1B - ref_value_IL1B_c
cec$RQ_IL1B<- (2^(-cec$DD_IL1B))
cec$RQ_IL1B_log<-(-log(cec$RQ_IL1B))
cec$DD_IL10<-cec$DeltaCq_IL10 - ref_value_IL10_c
cec$RQ_IL10<- 2^(-cec$DD_IL10)
cec$RQ_IL10_log<-(-log(cec$RQ_IL10))
cec$DD_MUC2<-cec$DeltaCq_MUC2 - ref_value_MUC2_c
cec$RQ_MUC2<- (2^(-cec$DD_MUC2))
cec$RQ_MUC2_log<-(-log(cec$RQ_MUC2))
#Ileum calculations deltadelta and rq and new log transformation
il$DD_IL1B<-il$DeltaCq_IL1B - ref_value_IL1B_i
il$RQ_IL1B<- (2^(-il$DD_IL1B))
il$RQ_IL1B_log<-(-log(il$RQ_IL1B))
il$DD_IL10<-il$DeltaCq_IL10 - ref_value_IL10_i
il$RQ_IL10<- 2^(-il$DD_IL10)
il$RQ_IL10_log<-(-log(il$RQ_IL10))
il$DD_MUC2<-il$DeltaCq_MUC2 - ref_value_MUC2_i
il$RQ_MUC2<- (2^(-il$DD_MUC2))
il$RQ_MUC2_log<-(-log(il$RQ_MUC2))
#Statistics for Deltacq values
#(note: when computing the anova with the logtransformed values we get the exact same pvalues)
#compute anova on treatment
res.aov1= aov(DeltaCq_IL1B ~ Treatment, data=cec)
res.aov2= aov(DeltaCq_IL10 ~ Treatment, data=cec)
res.aov3= aov(DeltaCq_MUC2 ~ Treatment, data=cec)
res.aov4= aov(DeltaCq_IL1B ~ Treatment, data=il)
res.aov5= aov(DeltaCq_IL10 ~ Treatment, data=il)
res.aov6= aov(DeltaCq_MUC2 ~ Treatment, data=il)
#summary anova on treatment
summary(res.aov1) #pval= 0.025*
summary(res.aov2) #pval= 0.009**
summary(res.aov3) #pval= 0.787
summary(res.aov4) #pval= 0.00192**
summary(res.aov5) #pval= 0.12
summary(res.aov6) #pval= 0.17
#tukey
tukey1 = TukeyHSD(res.aov1) #*
tukey2 = TukeyHSD(res.aov2) #**
tukey3 = TukeyHSD(res.aov3)
tukey4 = TukeyHSD(res.aov4) #**
tukey5 = TukeyHSD(res.aov5)
tukey6 = TukeyHSD(res.aov6)
#Define letters to add statistics to the graph
cld <- multcompLetters4(res.aov1, tukey1)
IL1B_cec <- data.frame(letters = cld$'Treatment'$Letters)
IL1B_cec$Treatment <- rownames(IL1B_cec)
names(IL1B_cec)[1] <- "IL1B_cec"
cld <- multcompLetters4(res.aov2, tukey2)
IL10_cec <- data.frame(letters = cld$'Treatment'$Letters)
IL10_cec$Treatment <- rownames(IL10_cec)
names(IL10_cec)[1] <- "IL10_cec"
cld <- multcompLetters4(res.aov3, tukey3)
MUC2_cec <- data.frame(letters = cld$'Treatment'$Letters)
MUC2_cec$Treatment <- rownames(MUC2_cec)
names(MUC2_cec)[1] <- "MUC2_cec"
cld <- multcompLetters4(res.aov4, tukey4)
IL1B_ile <- data.frame(letters = cld$'Treatment'$Letters)
IL1B_ile$Treatment <- rownames(IL1B_ile)
names(IL1B_ile)[1] <- "IL1B_ile"
cld <- multcompLetters4(res.aov5, tukey5)
IL10_ile <- data.frame(letters = cld$'Treatment'$Letters)
IL10_ile$Treatment <- rownames(IL10_ile)
names(IL10_ile)[1] <- "IL10_ile"
cld <- multcompLetters4(res.aov6, tukey6)
MUC2_ile <- data.frame(letters = cld$'Treatment'$Letters)
MUC2_ile$Treatment <- rownames(MUC2_ile)
names(MUC2_ile)[1] <- "MUC2_ile"
```
*FIGURE 14:*
```{r Gene expression plots cec il1b ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
#PLOTS
#included here only plots for report, all the others plots for deltaqc, by samplesm and others can be found in "E335_subsetexp1_ge.R"
##Plots by TREATMENTS
#violin plot de transformacion -log (tambein puede ser box con cuartiles y mediana), el punto rojo es el mean
#si esta el promedio (punto rojo) por debajo de cero es que esta mas expresado en el control que en el promedio de muestras de ese tratamiento
#cecum (plots with letters for statistics)
meanscec <- aggregate(RQ_IL1B_log ~ Treatment, cec, mean)
cec <- cec %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
plot1 <- ggplot(cec, aes(x = Treatment, y = RQ_IL1B_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meanscec, aes(label = RQ_IL1B_log, y = RQ_IL1B_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-8, 8))+
ggtitle("Cecum IL1B")+
geom_text(data = IL1B_cec, aes(x = Treatment, y = 6, label = IL1B_cec), size = 5) +
theme(text=element_text(size=15))
plot1
```
*FIGURE 15:*
```{r Gene expression plots cec il10 ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
meanscec <- aggregate(RQ_IL10_log ~ Treatment, cec, mean)
plot1 <- ggplot(cec, aes(x =Treatment, y = RQ_IL10_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meanscec, aes(label = RQ_IL10_log, y = RQ_IL10_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-2.5, 1.5))+
ggtitle("Cecum IL10")+
geom_text(data = IL10_cec, aes(x = Treatment, y = 1, label = IL10_cec), size = 5) +
theme(text=element_text(size=15))
plot1
```
*FIGURE 16:*
```{r Gene expression plots cec muc2 ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
meanscec <- aggregate(RQ_MUC2_log ~ Treatment, cec, mean)
plot1 <- ggplot(cec, aes(x = Treatment, y = RQ_MUC2_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meanscec, aes(label = RQ_MUC2_log, y = RQ_MUC2_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-6, 6))+
ggtitle("Cecum MUC2")+
geom_text(data = MUC2_cec, aes(x = Treatment, y = 5, label = MUC2_cec), size = 5) +
theme(text=element_text(size=15))
plot1
```
*FIGURE 17:*
```{r Gene expression plots ile il1b ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
#ileum (plots with letters for statistics)
meansile <- aggregate(RQ_IL1B_log ~ Treatment, il, mean)
il <- il %>%
mutate( Treatment=factor(Treatment,levels=c("LowCa", "LowCa+MediumAlphaD3", "MediumLowCa", "MediumLowCa+MediumAlphaD3", "MediumHighCa", "MediumHighCa+MediumAlphaD3", "HighCa","HighCa+MediumAlphaD3")) )
plot1 <- ggplot(il, aes(x = Treatment, y = RQ_IL1B_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meansile, aes(label = RQ_IL1B_log, y = RQ_IL1B_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-6, 8))+
ggtitle("Ileum IL1B")+
geom_text(data = IL1B_ile, aes(x = Treatment, y = 7, label = IL1B_ile), size = 5) +
theme(text=element_text(size=15))
plot1
```
*FIGURE 18:*
```{r Gene expression plots ile il10 ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
meansile <- aggregate(RQ_IL10_log ~ Treatment, il, mean)
plot1 <- ggplot(il, aes(x = Treatment, y = RQ_IL10_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meansile, aes(label = RQ_IL10_log, y = RQ_IL10_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-1, 1.6))+
ggtitle("Ileum IL10")+
geom_text(data = IL10_ile, aes(x = Treatment, y = 1.5, label = IL10_ile), size = 5) +
theme(text=element_text(size=15))
plot1
```
*FIGURE 19:*
```{r Gene expression plots ile muc2 ,fig.width = 10, fig.height = 8, echo=FALSE, warning=FALSE}
meansile <- aggregate(RQ_MUC2_log ~ Treatment, il, mean)
plot1 <- ggplot(il, aes(x =Treatment, y = RQ_MUC2_log, fill = Treatment)) +
geom_violin() + #geom_boxplot()+ #or geom_violin()
xlab("") +
ylab("Log-fold-change") +
theme(legend.position = "none") +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))+
stat_summary(fun=mean, geom="point", size=2, color="red", fill="red")+
geom_text(data = meansile, aes(label = RQ_MUC2_log, y = RQ_MUC2_log + 0.2),
position = position_dodge(width = 1),vjust = -0.5, size = 2)+
coord_cartesian(ylim = c(-4, 2.3))+
ggtitle("Ileum MUC2")+
geom_text(data = MUC2_ile, aes(x = Treatment, y = 2, label = MUC2_ile), size = 5) +
theme(text=element_text(size=15))
plot1
```
# Microbiome + Gene expression{data-navmenu="Cross-panel Analyses"}
#### Microbiome + Gene expression
##### Correlation microbiome + gene expression
In addition to studying the role of treatments and challenges on SIWA metrics, we are also interested in understanding if there are associations/correlations between traits that might help us better understand the complex interactions in this system.
We begin with Kendall correlations between the top 20 most abundant microbial taxa and the expression of IL-10, IL-1B, and MUC-2. A positive or red correlation indicates that when a taxon is increased, the expression of the correlated gene is also increased. The opposite it true when the correlation is negative (blue). It is important to understand that this correlation does not mean that changes in the microbiome *cause* changes in gene expression, or vice versa, only that they are moving in the same direction. The asterisk (*) indicates that the correlation is significant.
```{r open data 1, include=FALSE}
## Data creada con create_full_file_for_correlations_to_analytics - Jupyter Notebook
#meta with ratios for I and C
meta_exp <- complete_sample_table
#Extract taxonomy and otu table from phyloseq object
taxonomy <-as.data.frame(tax_table(ODLEPobj))
taxonomy$OTU <- row.names(taxonomy)
otu_table <- otu_table(ODLEPobj)
otu_table <- as.data.frame(otu_table)
otu_table$OTU<- row.names(otu_table)
#filter otus and taxonomy by the samples that are present in meta_exp (not needed)
#otu_table <- otu_table[c(meta_exp$SampleID, "OTU")]
#taxonomy <- filter(taxonomy, OTU %in% otu_table$OTU)
### AGGREGATE BY TAXA
otu_table_agg <- aggregate_otus_by_taxa(meta_exp, otu_table, taxonomy, "Family")
#R le pone una X y le quita el guion al sampleid - con check.names = FALSE se arregla. SINO, correr lo sgte:
#names(otu_table)<-sapply(str_remove_all(colnames(otu_table),"X"),"[")
#names(otu_table) <- str_replace_all(colnames(otu_table),'[.]', '-')
otu_table <- otu_table_agg ## para que a partir de aquí sea igual
#Transpose the data to have sample names on rows
otu_table <- as.matrix(otu_table)
otu_table <- t(otu_table)
class(otu_table)
rownames(otu_table) <-
sapply(str_remove_all(rownames(otu_table), "X"), "[") # porsi
rownames(otu_table) #deben ser las samples
```
```{r transformation 1, include=FALSE}
#####transformation and filter for compositional data
rowz<-apply(otu_table == 0, 2, sum)/dim(otu_table)[1] #sum over columns (OTUS)
# dejar otus frecuentes, quitando los que están llenos de CEROS
p <- which(rowz>0.9)
otu_table_filtered <-otu_table[, -p]
# colz <- apply(otu_table_filtered == 0, 1, sum)/dim(otu_table_filtered)[2]
#pp<-which(colz>0.9)
otu_table_filtered<- as.matrix(otu_table_filtered)
require("zCompositions")
require("compositions")
otu_table_filtered_transformed <- cmultRepl(otu_table_filtered)
otu_table_filtered_transformed_log <- clr(otu_table_filtered_transformed)
dim(otu_table_filtered_transformed_log)
#Extract the corresponding meta_table for the samples in abund_table
rownames(meta_exp) <- meta_exp$SampleID
meta_exp <- meta_exp[rownames(otu_table_filtered_transformed_log), ]
rownames(otu_table_filtered_transformed_log)
colnames(meta_exp)
```
```{r choose variables 1, include=FALSE}
#When its ony one variable y just leave the c("") with onw variable.
#You can use sel_env to specify the variables you want to use and sel_env_label to specify the labels for the pannel
sel_vars <-
c(
"DeltaCq_IL10",
"DeltaCq_IL1B",
"DeltaCq_MUC2"
)
## Asi van a aparecer en el plot
sel_vars_label <- list(
"DeltaCq_IL10" = "DeltaCq_IL10",
"DeltaCq_IL1B" = "DeltaCq_IL1B",
"DeltaCq_MUC2" = "DeltaCq_MUC2"
)
sel_vars_label<-t(as.data.frame(sel_vars_label))
sel_vars_label<-as.data.frame(sel_vars_label)
colnames(sel_vars_label)<-c("Trans")
#Now get a filtered table based on sel_env --> DEJAR SOLO SAMPLES CON METADATOS
meta_exp_filtered<-meta_exp[,sel_vars]
meta_exp_filtered <- as.data.frame(meta_exp_filtered)
X <- otu_table_filtered_transformed_log
X <- X[, order(colSums(X), decreasing = TRUE)] #### REVISAAAAR CENTERED LOG RATIO TRANSFORM!!!!
dim(X)
#Extract list of top N Taxa
N<-20
otus_list<-colnames(X)[1:N]
X <- data.frame(X[, colnames(X) %in% otus_list], check.names = FALSE)
Y <- meta_exp_filtered
dim(X)
#Get grouping information
groups_ <- meta_exp$SampleLocation
#### Convertir las variables a numéricas
Y$DeltaCq_IL10 <- as.numeric(Y$DeltaCq_IL10)
Y$DeltaCq_IL1B <- as.numeric(Y$DeltaCq_IL1B)
Y$DeltaCq_MUC2 <- as.numeric(Y$DeltaCq_MUC2 )
```
```{r correlation method 1, include=FALSE}
#Choose correlation method
method<-"kendall"
#method <- "spearman"
#method <- "pearson"
#Now calculate the correlation between individual Taxa and the environmental data
df <- NULL
for (i in colnames(X)) {
for (j in colnames(Y)) {
for (k in c("C", "I")) {
a <- X[groups_ == k, i, drop = F]
b <- Y[groups_ == k, j, drop = F]
tmp <- c(
i,
j,
cor(a[complete.cases(b), ], b[complete.cases(b), ],
use = "everything", method = method),
cor.test(a[complete.cases(b), ], b[complete.cases(b), ], method =
method)$p.value,
k
)
if (is.null(df)) {
df <- tmp
}
else{
df <- rbind(df, tmp)
}
}
}
}
df<-data.frame(row.names=NULL,df)
colnames(df)<-c("Taxa","Env","Correlation","Pvalue","Type")
df$Pvalue<-as.numeric(as.character(df$Pvalue))
df$AdjPvalue<-rep(0,dim(df)[1])
df$Correlation<-as.numeric(as.character(df$Correlation))
adjustment_label<-c("NoAdj","AdjEnvAndType","AdjTaxaAndType","AdjTaxa","AdjEnv")
adjustment<-5
for(i in unique(df$Env)){
sel<-df$Env==i
df$AdjPvalue[sel]<-p.adjust(df$Pvalue[sel],method="BH")
}
#Now we generate the labels for signifant values
df$Significance <-
cut(
df$AdjPvalue,
breaks = c(-Inf, 0.01, 0.1, 0.2, Inf),
label = c("***", "**", "*", "")
)
df<-df[complete.cases(df),]
Env_labeller <- function(variable,value){
return(sel_vars_label[as.character(value),"Trans"])
}
df$label <- df$Taxa
```
*FIGURE 20: *
```{r crear plot 1, include=TRUE,echo=FALSE, warning=FALSE, fig.dim = c(10, 6)}
p <- ggplot(aes(x = Type, y = Taxa, fill = Correlation), data = df)
p <-
p + geom_tile() + scale_fill_gradient2(low = "#2C7BB6", mid = "white", high =
"#D7191C")
p <-
p + theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust=0.5))
p <-
p + geom_text(aes(label = Significance), color = "black", size = 3) + labs(y =
NULL, x = NULL, fill = method)
p <-
p + facet_grid(
. ~ Env,
drop = TRUE,
scale = "free",
space = "free_x",
labeller = Env_labeller
)+ theme(text=element_text(size=15))
p
```
# Microbiome + Histopathology {data-navmenu="Cross-panel Analyses"}
#### Taxonomic composition by histopathology score
Histopathology scores are condensed from a 6-point scale (0-5) to a 3-point scale (mild, moderate, severe) in order to facilitate correlations between these scores and other traits of interest. However, due to the unbalanced nature of these scores (see tables below), it can be difficult to accurately assess correlations between these scores and other traits, and any significant correlations should be viewed very carefully.
```{r open histo data, include=FALSE}
meta_exp <-complete_sample_table
```
```{r classify histo scores in categories, include=FALSE}
#as character and round up
meta_exp$OverallArchitecture <- as.character(ceiling(meta_exp$OverallArchitecture))
meta_exp$MucosalIntegrity <- as.character(ceiling(meta_exp$MucosalIntegrity))
meta_exp$LymphoidImmune <- as.character(ceiling(meta_exp$LymphoidImmune))
meta_exp$InflammationSeverity <- as.character(ceiling(meta_exp$InflammationSeverity))
meta_exp$MicrobialOrganisms <- as.character(ceiling(meta_exp$MicrobialOrganisms))
#define levels
levels = c("0", "1", "2", "3", "4", "5")
meta_exp$OverallArchitecture <- factor(meta_exp$OverallArchitecture, levels=levels)
meta_exp$MucosalIntegrity <- factor(meta_exp$MucosalIntegrity, levels=levels)
meta_exp$LymphoidImmune <- factor(meta_exp$LymphoidImmune, levels=levels)
meta_exp$InflammationSeverity <- factor(meta_exp$InflammationSeverity, levels=levels)
meta_exp$MicrobialOrganisms <- factor(meta_exp$MicrobialOrganisms, levels=levels)
#meta_exp <- meta_exp[complete.cases(meta_exp),] #keep only the ones with no missing values, im not using it because i want to keep all
#ver distribucion de los puntajes en cada score
histo <- meta_exp %>% dplyr::select(SampleID, OverallArchitecture, MucosalIntegrity, LymphoidImmune,
InflammationSeverity, MicrobialOrganisms)
rownames(histo) <- histo$SampleID
histo$SampleID <- NULL
histo %>% gather()
str(histo)
ggplot(gather(histo), aes(value)) +
stat_count(width = 0.5) +
facet_wrap(~key, scales = 'free_x')
#redistribuir/clasificar los puntajes en categorias
#note: no todas las categorias se redistribuyen de la misma manera. Estas distribuciones fueron definidas con histopatologo (Monty)
for (i in c("InflammationSeverity", "MicrobialOrganisms", "OverallArchitecture")){
print(i)
meta_exp[meta_exp[,i] %in% c("0","1"), paste0(i, "LEVEL")] <- "Mild"
meta_exp[meta_exp[,i] %in% c("2","3"), paste0(i, "LEVEL")] <- "Mod"
meta_exp[meta_exp[,i] %in% c("4","5"), paste0(i, "LEVEL")] <- "Sev"
}
meta_exp[meta_exp[,"LymphoidImmune"] %in% c("0","1", "2"), paste0("LymphoidImmune", "LEVEL")] <- "Mild"
meta_exp[meta_exp[,"LymphoidImmune"] %in% c("3","4"), paste0("LymphoidImmune", "LEVEL")] <- "Mod"
meta_exp[meta_exp[,"LymphoidImmune"] %in% c("5"), paste0("LymphoidImmune", "LEVEL")] <- "Sev"
meta_exp[meta_exp[,"MucosalIntegrity"] %in% c("0"), paste0("MucosalIntegrity", "LEVEL")] <- "Mild"
meta_exp[meta_exp[,"MucosalIntegrity"] %in% c("1","2"), paste0("MucosalIntegrity", "LEVEL")] <- "Mod"
meta_exp[meta_exp[,"MucosalIntegrity"] %in% c("3", "4", "5"), paste0("MucosalIntegrity", "LEVEL")] <- "Sev"
#Mirar otra vez la distribucion de los puntajes de cada score pero ahora en categorias
histo <- meta_exp %>% dplyr::select(OverallArchitectureLEVEL, MucosalIntegrityLEVEL, LymphoidImmuneLEVEL,
InflammationSeverityLEVEL, MicrobialOrganismsLEVEL, SampleLocation, SampleID)
rownames(histo) <- histo$SampleID
histo$SampleID <- NULL
ggplot(gather(histo), aes(value)) +
stat_count(width = 0.5) +
facet_wrap(~key, scales = 'free_x')
#create table to state how many samples there are per category in each score and location
inf= as.data.frame(table(meta_exp$InflammationSeverityLEVEL, meta_exp$SampleLocation))
inf$score= "InflammationSeverity"
micro=as.data.frame(table(meta_exp$MicrobialOrganismsLEVEL, meta_exp$SampleLocation))
micro$score= "MicrobialOrganisms"
archi=as.data.frame(table(meta_exp$OverallArchitectureLEVEL, meta_exp$SampleLocation))
archi$score= "OverallArchitecture"
lym=as.data.frame(table(meta_exp$LymphoidImmuneLEVEL, meta_exp$SampleLocation))
lym$score= "LymphoidImmune"
muco=as.data.frame(table(meta_exp$MucosalIntegrityLEVEL, meta_exp$SampleLocation))
muco$score= "MucosalIntegrity"
table_sample_count <- rbind(inf, micro,archi,lym,muco)
table_sample_count_cecum=table_sample_count[table_sample_count$Var2 == 'C',]
table_sample_count_cecum$Var2 <- NULL
table_cecum <- as.data.frame.matrix(xtabs(Freq ~ Var1 + score, table_sample_count_cecum))
table_sample_count_ileum=table_sample_count[table_sample_count$Var2 == 'I',]
table_sample_count_ileum$Var2 <- NULL
table_ileum <- as.data.frame.matrix(xtabs(Freq ~ Var1 + score, table_sample_count_ileum))
```
```{r Taxonomic composotion data open (phyloseq) + aggregate and add histo to phyloseq, include=FALSE}
#Filter phyloseq object to include only C and I samples
ODLEPobj_c_i <- subset_samples(ODLEPobj, SampleLocation%in%c("C","I"))
#Organize histo to add categories ass new variables in the existing phyloseq object
histo <- rownames_to_column(histo, var = "SampleID")
#histo$SampleType="M" #add sample type of microbiome to jpin together with microbiome data
#histo$SampleID <- apply( histo[ , c( 'SampleID' , 'SampleType' ) ] , 1 , paste , collapse = "-" )
#rownames(histo) <- histo$SampleID
#histo$SampleType<- NULL
##add histo categories as a new metadata to the phyloseq object
histo_sam <- sample_data(histo)
ps <- merge_phyloseq(ODLEPobj_c_i, histo_sam)
metadata_ps <- (sample_data(ps))
#Convert to relative conuts
ODLEPobj_rel <- microbiome::transform(ps, "compositional") #relative abundance
###aggregate taxa
taxonomy <-as.data.frame(tax_table(ODLEPobj_rel))
taxonomy <- dplyr::select(taxonomy,-c(SciName))
taxa_level <- "Genus"
phyloseq_ob_agg <-
microbiome::aggregate_taxa(ODLEPobj_rel, taxa_level)
keep <- list(unique(taxonomy[taxa_level]))
phyloseq_ob_agg <-
phyloseq::prune_taxa(unique(taxonomy %>% pull(taxa_level)), phyloseq_ob_agg)
#otu_table <- as.data.frame(otu_table(phyloseq_ob_agg)) #This is features table agg by genus with relative data
#Separate phyloseq object by location
phyloseq_ob_agg_cecum<-subset_samples(phyloseq_ob_agg, SampleLocation=="C")
phyloseq_ob_agg_ileum<-subset_samples(phyloseq_ob_agg, SampleLocation=="I")
```
*FIGURE 21:*
```{r Tax composition + histo plot cecum, include=FALSE}
#Get top taxa for graphs CECUM
otu_table <- as.data.frame(otu_table(phyloseq_ob_agg_cecum))
phyloseq_ob_agg_cecum <- get_top_taxa(phyloseq_ob_agg_cecum, 15, relative=TRUE, discard_other=FALSE,
other_label = "Other")
otu_table <- as.data.frame(otu_table(phyloseq_ob_agg_cecum))
#create dataframe from phyloseq object for CECUM
subset.genus.df_cecum <- psmelt(phyloseq_ob_agg_cecum)
subset.genus.df_cecum$genus <- as.character(subset.genus.df_cecum$Genus) #convert to character
#generate plots with for CECUM
#(mirar si se puede hacer con facets)
myplots <- list() # new empty list
catgegories= list("OverallArchitectureLEVEL","MucosalIntegrityLEVEL", "LymphoidImmuneLEVEL","InflammationSeverityLEVEL","MicrobialOrganismsLEVEL")
catgegories_labels <- c(
"OverallArchitectureLEVEL"="Overall Architecture",
"MucosalIntegrityLEVEL"="Mucosal Integrity",
"LymphoidImmuneLEVEL"="Lymphoid Immune",
"InflammationSeverityLEVEL"= "Inflammation Severity",
"MicrobialOrganismsLEVEL" = "Microbial Organisms"
)
for (i in catgegories){
print(i)
p <- ggplot(data=subset.genus.df_cecum, aes_string(x=i, y="Abundance",fill="genus")) #importat: aes as string, or if aes is not string maybe try #(x=eval(parse(text = i))
plot_abundance<- p + geom_bar(stat="identity", position="fill") +
xlab(catgegories_labels[i]) +
ylab(NULL)+
theme(text=element_text(size=10)) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))
plot_abundance
myplots[[i]] <- local({
print(plot_abundance)
})
}
```
```{r print plot cecum, include=TRUE, echo=FALSE, fig.width = 10, fig.height= 6}
#hacer grid de graficas CECUM
figure_composition_cecum <- ggarrange(myplots[[1]], myplots[[2]], myplots[[3]],myplots[[4]],myplots[[5]],
ncol = 5, nrow = 1,
common.legend = TRUE,
font.label = list(size = 10, color = "black", face = "plain"))
annotate_figure(figure_composition_cecum,
top = text_grob("Taxonomic composition cecum", color = "black", face = "bold", size = 14),
left = text_grob("Relative abundance", color = "black", size = 12, rot= 90)
)
```
```{r imprimir cecum tablas con conteo de muestras, include=TRUE, echo=FALSE, warning=FALSE, fig.width = 10}
kbl(table_cecum, caption = "Table 4: Cecum- Sample count per category in each score", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
\
\
*FIGURE 22:*
```{r Tax composition + histo plot ileum, include=FALSE}
#Get top taxa for graphs ILEUM
otu_table <- as.data.frame(otu_table(phyloseq_ob_agg_ileum))
phyloseq_ob_agg_ileum <- get_top_taxa(phyloseq_ob_agg_ileum, 15, relative=TRUE, discard_other=FALSE,
other_label = "Other")
otu_table <- as.data.frame(otu_table(phyloseq_ob_agg_ileum))
#create dataframe from phyloseq object for ILEUM
subset.genus.df_ileum <- psmelt(phyloseq_ob_agg_ileum)
subset.genus.df_ileum$genus <- as.character(subset.genus.df_ileum$Genus) #convert to character
#generate plots with for ILEUM
myplots <- list() # new empty list
for (i in catgegories){
print(i)
p <- ggplot(data=subset.genus.df_ileum, aes_string(x=i, y="Abundance",fill="genus")) #importat: aes as string
plot_abundance<- p + geom_bar(stat="identity", position="fill") +
xlab(catgegories_labels[i]) +
ylab(NULL)+
theme(text=element_text(size=10)) +
theme(axis.text.x = element_text(angle = 60, hjust = 1))
plot_abundance
myplots[[i]] <- local({
print(plot_abundance)
})
}
```
```{r print plot ileum, include=TRUE, echo=FALSE, fig.width = 10, fig.height= 6}
#hacer grid de graficas ILEUM
figure_composition_ileum <- ggarrange(myplots[[1]], myplots[[2]], myplots[[3]],myplots[[4]],myplots[[5]],
ncol = 5, nrow = 1,
common.legend = TRUE,
font.label = list(size = 10, color = "black", face = "plain"))
annotate_figure(figure_composition_ileum,
top = text_grob("Taxonomic composition ileum", color = "black", face = "bold", size = 14),
left = text_grob("Relative abundance", color = "black", size = 12, rot= 90)
)
```
```{r imprimir ileum tablas con conteo de muestras, include=TRUE, echo=FALSE, warning=FALSE, fig.width = 10}
kbl(table_ileum, caption = "Table 5: Ileum- Sample count per category in each score", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
# Gene expression + Histopathology {data-navmenu="Cross-panel Analyses"}
##### Gene expression by histopathology score
Similar to the previous tab, these graphs explore the relationship between histopathology scores and another trait of interest, gene expression. Again, due to imbalances in the dataset, view any changes seen here with caution.
```{r ge histo organize, include=FALSE}
#####Open ge data#####
ge_data <- complete_sample_table
ge_data <- subset( ge_data, select = c(18, 27, 28, 29 ))
####Organize histo data####
histo_data<- as.data.frame(histo)
histo_data_class <- subset(histo_data, select = c(1:6))
histo_data_location <- subset(histo_data, select = c(1,7))
#merge dataframes ge and histo
histo_data_melted <- histo_data_class %>%
gather(Category, Score, -SampleID)
ge_data_melted <- ge_data %>%
gather(Gene, Value, -SampleID)
boxplot_data <- merge(ge_data_melted, histo_data_melted , by = "SampleID")
boxplot_data <- merge(boxplot_data, histo_data_location, by = "SampleID")
boxplot_data <- boxplot_data[!is.na(boxplot_data$Value),]
#Separate data for C and I
boxplot_data_cecum <- boxplot_data[boxplot_data$SampleLocation =="C",]
boxplot_data_ileum <- boxplot_data[boxplot_data$SampleLocation =="I",]
```
*FIGURE 23:*
Cecum
```{r ge histo plots cec, include=TRUE, echo=FALSE, fig.dim = c(10, 6)}
#cecum plot
bp <- ggplot(boxplot_data_cecum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_cecum <- bp + facet_grid(Gene~ Category)+
xlab("") + ylab("Gene expression") +
theme(legend.position = "none")
bp_cecum
```
\
*FIGURE 24:*
Ileum
```{r ge histo plots ile, include=TRUE, echo=FALSE, fig.dim = c(10, 6)}
#Ileum plot
bp <- ggplot(boxplot_data_ileum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_ileum <- bp + facet_grid(Gene~ Category)+
xlab("") + ylab("Gene expression") +
theme(legend.position = "none")
bp_ileum
```
# SIWA Ratios {data-icon="fa-table"}
**SIWA Ratios and Indexes in relation to Histopathology and Gene Expression panels.**
Below, you will find linear regressions between single traits (alpha diversity or log ratios of bacteria) and gene expression and histopathology. A regression score is calculated (R-squared), and a line is plotted to show the relationship between the traits. A slope from high to low indicates a negative relationship between the traits, while low-high indicates a positive relationship. The R-squared value suggests the size of the effect, and the tables below show the calculated p-values for each regression analysis.
**Diversity indexes:**
*Observed features* = represents richness in the sample, defined as the number of different species present in it.
*Shannon diversity* = This index measures the homogeneity in abundance of the different species in a sample. In other words, shannon index is an estimate of how complex the community is, both the number of different bacteria, and also how different these bacteria are in their function or genetics.
\
**SIWA Ratios:**
These ratios each use two major categories of bacteria in an attempt to simplify the complex microbial community into a single value that can tell us something about the state of the community. These values should be interpreted carefully, as they ignore a lot of information and do not tell a complete story. However, they may be correlated with other variables of interest in a useful way.
\
*SIWA Ratio 1*= Lactobacillus/ Escherichia-Shigella
Lactobacillus species are overwhelmingly beneficial to the host, while E coli species include commensal strains as well as opportunistic and pathogenic strains. A comparison of the two summarizes the load of both beneficial and potentially pathogenic microbes in the small intestine. A higher ratio of Lactobacillus to Escherichia-Shigella would be expected in healthier animals.
\
*SIWA Ratio 2*= Firmicutes/ Proteobacteria
Firmicutes include both beneficial (Lactobacillus, Bacillus, Ruminococcus, Lachnospiraceae, and Pediococcus) and pathogenic (Clostridium, Streptococcus, Staphylococcus, and Listeria) groups. Proteobacteria include pathogenic groups such as E coli, Salmonella, Shigella, Legionella, Vibrio and Pseudomonas. While Firmicutes may contain pathogenic bacteria as well as beneficial ones, a higher ratio of Firmicutes:Proteobacteria would be expected to be associated with better health.
\
*SIWA Ratio 3*= Lactobacillus/rest of the genera
Lactobacillus can account for as much as 90% of sequenced bacteria in the ileum of chickens. By tracking this ratio, we can understand if higher or lower levels of Lactobacillus are correlated with health and performance outcomes.
\
**Linear regressions between alpha diversity indexes and gene expression**
```{r open and organice data, include=FALSE}
#####Open data#####
data <- complete_sample_table
#####Separate by sample location#####
metadata_cecum <- data[data$SampleLocation=="C",]
metadata_ileum<- data[data$SampleLocation=="I",]
```
```{r functions, include=FALSE}
#####scatter plot function####
plot_regression <- function(x, y, metadata, x_label, y_label) {
Model <- lm(y ~ x, data = metadata)
R2= as.character(signif(summary(Model)$r.squared,digits=3))
R2= paste("R2=", R2)
plott=ggplot(metadata, aes(x=x, y=y)) +
geom_point() +
geom_smooth(formula = y ~ x, method=lm, se=FALSE)+
annotate("text", -Inf, Inf, label = R2 , hjust = 0, vjust = 1)+
xlab(x_label) + ylab(y_label)
return(plott)
}
####function to get R2####
get_stats_R <- function(x, y, metadata) {
Model <- lm(y ~ x, data = metadata)
statR= summary(Model)$r.squared
return(statR)
}
####function to get Pval of model####
get_stats_P <- function (x, y, metadata) {
modelobject<- lm(y ~ x, data = metadata)
if (class(modelobject) != "lm") stop("Not an object of class 'lm' ")
f <- summary(modelobject)$fstatistic
p <- pf(f[1],f[2],f[3],lower.tail=F)
attributes(p) <- NULL
return(p)
}
```
Row
-----------------------------------------------------------------------
### FIGURE 25 {data-width=500}
```{r aplhadiv + ge cecum, include=TRUE, echo=FALSE, warning=FALSE}
#Grid 1: CECUM
metadata= metadata_cecum
#IL10 + alphadiv
x_label= "Observed features"
y_label= "IL10"
x <- metadata$Alfa_Observed
y <- metadata$DeltaCq_IL10
cec_IL10_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
cec_IL10_Alfa_Observed_R=get_stats_R(x, y, metadata)
cec_IL10_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "IL10"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_IL10
cec_IL10_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
cec_IL10_Alfa_Shannon_R=get_stats_R(x, y, metadata)
cec_IL10_Alfa_Shannon_P=get_stats_P(x, y, metadata)
#IL1B + alphadiv
x_label= "Observed features"
y_label= "IL1B"
x <- metadata$Alfa_Observed
y <- metadata$DeltaCq_IL1B
cec_IL1B_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
cec_IL1B_Alfa_Observed_R=get_stats_R(x, y, metadata)
cec_IL1B_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "IL1B"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_IL1B
cec_IL1B_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
cec_IL1B_Alfa_Shannon_R=get_stats_R(x, y, metadata)
cec_IL1B_Alfa_Shannon_P=get_stats_P(x, y, metadata)
#MUC2 + alphadiv
x_label= "Observed features"
y_label= "MUC2"
x <- metadata$Alfa_Observed
y <- metadata$DeltaCq_MUC2
cec_MUC2_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
cec_MUC2_Alfa_Observed_R=get_stats_R(x, y, metadata)
cec_MUC2_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "MUC2"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_MUC2
cec_MUC2_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
cec_MUC2_Alfa_Shannon_R=get_stats_R(x, y, metadata)
cec_MUC2_Alfa_Shannon_P=get_stats_P(x, y, metadata)
gird_cecum=ggarrange(cec_IL10_Alfa_Observed, cec_IL10_Alfa_Shannon,
cec_IL1B_Alfa_Observed, cec_IL1B_Alfa_Shannon,
cec_MUC2_Alfa_Observed,cec_MUC2_Alfa_Shannon,
ncol = 2, nrow = 3)
gird_cecum_gene_aplhadiv= annotate_figure(gird_cecum, top = text_grob("Cecum",
color = "black", face = "bold", size = 14))
gird_cecum_gene_aplhadiv
```
### {data-width=500}
```{r aplhadiv + ge cecum tab, include=TRUE, echo=FALSE, warning=FALSE}
tab <- matrix(c(cec_IL10_Alfa_Observed_R, cec_IL10_Alfa_Observed_P,
cec_IL10_Alfa_Shannon_R, cec_IL10_Alfa_Shannon_P,
cec_IL1B_Alfa_Observed_R, cec_IL1B_Alfa_Observed_P,
cec_IL1B_Alfa_Shannon_R, cec_IL1B_Alfa_Shannon_P,
cec_MUC2_Alfa_Observed_R, cec_MUC2_Alfa_Observed_P,
cec_MUC2_Alfa_Shannon_R,cec_MUC2_Alfa_Shannon_P), ncol=2, byrow=TRUE)
colnames(tab) <- c('R2','Pval')
rownames(tab) <- c('IL10_Observed','IL10_Shannon',
'IL1B_Observed', 'IL1B_Shannon',
'MUC2_Observed', 'MUC2_Shannon')
table_cecum_gene_aplhadiv <- as.table(tab)
kbl(table_cecum_gene_aplhadiv, caption = "Ileum- Statistics for the previous regressions", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
Row
-----------------------------------------------------------------------
### FIGURE 26 {data-width=500}
```{r alphadiv + ge ileum, include=TRUE, echo=FALSE, warning=FALSE}
#Grid 2: ILEUM
metadata= metadata_ileum
#IL10 + alphadiv
x_label= "Observed features"
y_label= "IL10"
x <- metadata$Alfa_Observed
y <- metadata$ratio1LOG
ile_IL10_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
ile_IL10_Alfa_Observed_R=get_stats_R(x, y, metadata)
ile_IL10_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "IL10"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_IL10
ile_IL10_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
ile_IL10_Alfa_Shannon_R=get_stats_R(x, y, metadata)
ile_IL10_Alfa_Shannon_P=get_stats_P(x, y, metadata)
#IL1B + alphadiv
x_label= "Observed features"
y_label= "IL1B"
x <- metadata$Alfa_Observed
y <- metadata$DeltaCq_IL1B
ile_IL1B_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
ile_IL1B_Alfa_Observed_R=get_stats_R(x, y, metadata)
ile_IL1B_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "IL1B"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_IL1B
ile_IL1B_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
ile_IL1B_Alfa_Shannon_R=get_stats_R(x, y, metadata)
ile_IL1B_Alfa_Shannon_P=get_stats_P(x, y, metadata)
#MUC2 + alphadiv
x_label= "Observed features"
y_label= "MUC2"
x <- metadata$Alfa_Observed
y <- metadata$DeltaCq_MUC2
ile_MUC2_Alfa_Observed= plot_regression(x, y, metadata, x_label, y_label)
ile_MUC2_Alfa_Observed_R=get_stats_R(x, y, metadata)
ile_MUC2_Alfa_Observed_P=get_stats_P(x, y, metadata)
x_label= "Shannon diversity"
y_label= "MUC2"
x <- metadata$Alfa_Shannon
y <- metadata$DeltaCq_MUC2
ile_MUC2_Alfa_Shannon= plot_regression(x, y, metadata, x_label, y_label)
ile_MUC2_Alfa_Shannon_R=get_stats_R(x, y, metadata)
ile_MUC2_Alfa_Shannon_P=get_stats_P(x, y, metadata)
gird_ileum=ggarrange(ile_IL10_Alfa_Observed, ile_IL10_Alfa_Shannon,
ile_IL1B_Alfa_Observed, ile_IL1B_Alfa_Shannon,
ile_MUC2_Alfa_Observed,ile_MUC2_Alfa_Shannon,
ncol = 2, nrow = 3)
gird_ileum_gene_aplhadiv= annotate_figure(gird_ileum, top = text_grob("Ileum",
color = "black", face = "bold", size = 14))
gird_ileum_gene_aplhadiv
```
### {data-width=500}
```{r alphadiv + ge ileum tab, include=TRUE, echo=FALSE, warning=FALSE}
tab <- matrix(c(ile_IL10_Alfa_Observed_R, ile_IL10_Alfa_Observed_P,
ile_IL10_Alfa_Shannon_R, ile_IL10_Alfa_Shannon_P,
ile_IL1B_Alfa_Observed_R, ile_IL1B_Alfa_Observed_P,
ile_IL1B_Alfa_Shannon_R, ile_IL1B_Alfa_Shannon_P,
ile_MUC2_Alfa_Observed_R, ile_MUC2_Alfa_Observed_P,
ile_MUC2_Alfa_Shannon_R,ile_MUC2_Alfa_Shannon_P), ncol=2, byrow=TRUE)
colnames(tab) <- c('R2','Pval')
rownames(tab) <- c('IL10_Observed','IL10_Shannon',
'IL1B_Observed', 'IL1B_Shannon',
'MUC2_Observed', 'MUC2_Shannon')
table_ileum_gene_aplhadiv <- as.table(tab)
kbl(table_ileum_gene_aplhadiv, caption = "Ileum- Statistics for the previous regressions", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Ratios and gene expression**
Row {data-height=30}
-----------------------------------------------------------------------
Linear regressions between microbiome ratios and gene expression
Row
-----------------------------------------------------------------------
### FIGURE 27 {data-width=500}
```{r ratios + ge cecum, include=TRUE, echo=FALSE, warning=FALSE}
#Grid 1: CECUM
metadata= metadata_cecum
#IL10 + Ratios
x_label= "Ratio1"
y_label= "IL10"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_IL10
cec_IL10_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
cec_IL10_Ratio1_R=get_stats_R(x, y, metadata)
cec_IL10_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "IL10"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_IL10
cec_IL10_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
cec_IL10_Ratio2_R=get_stats_R(x, y, metadata)
cec_IL10_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "IL10"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_IL10
cec_IL10_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
cec_IL10_Ratio3_R=get_stats_R(x, y, metadata)
cec_IL10_Ratio3_P=get_stats_P(x, y, metadata)
#IL1B + Ratios
x_label= "Ratio1"
y_label= "IL1B"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_IL1B
cec_IL1B_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
cec_IL1B_Ratio1_R=get_stats_R(x, y, metadata)
cec_IL1B_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "IL1B"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_IL1B
cec_IL1B_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
cec_IL1B_Ratio2_R=get_stats_R(x, y, metadata)
cec_IL1B_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "IL1B"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_IL1B
cec_IL1B_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
cec_IL1B_Ratio3_R=get_stats_R(x, y, metadata)
cec_IL1B_Ratio3_P=get_stats_P(x, y, metadata)
#MUC2 + Ratios
x_label= "Ratio1"
y_label= "MUC2"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_MUC2
cec_MUC2_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
cec_MUC2_Ratio1_R=get_stats_R(x, y, metadata)
cec_MUC2_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "MUC2"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_MUC2
cec_MUC2_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
cec_MUC2_Ratio2_R=get_stats_R(x, y, metadata)
cec_MUC2_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "MUC2"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_MUC2
cec_MUC2_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
cec_MUC2_Ratio3_R=get_stats_R(x, y, metadata)
cec_MUC2_Ratio3_P=get_stats_P(x, y, metadata)
gird_cecum=ggarrange(cec_IL10_Ratio1, cec_IL10_Ratio2, cec_IL10_Ratio3,
cec_IL1B_Ratio1,cec_IL1B_Ratio2, cec_IL1B_Ratio3,
cec_MUC2_Ratio1, cec_MUC2_Ratio2, cec_MUC2_Ratio3,
ncol = 3, nrow = 3)
gird_cecum_gene_ratios= annotate_figure(gird_cecum, top = text_grob("Cecum",
color = "black", face = "bold"))
gird_cecum_gene_ratios
```
### {data-width=500}
```{r ratios + ge cecum tabb, include=TRUE, echo=FALSE, warning=FALSE}
tab <- matrix(c(cec_IL10_Ratio1_R, cec_IL10_Ratio1_P,
cec_IL10_Ratio2_R, cec_IL10_Ratio2_P,
cec_IL10_Ratio3_R, cec_IL10_Ratio3_P,
cec_IL1B_Ratio1_R, cec_IL1B_Ratio1_P,
cec_IL1B_Ratio2_R, cec_IL1B_Ratio2_P,
cec_IL1B_Ratio3_R, cec_IL1B_Ratio3_P,
cec_MUC2_Ratio1_R, cec_MUC2_Ratio1_P,
cec_MUC2_Ratio2_R, cec_MUC2_Ratio2_P,
cec_MUC2_Ratio3_R, cec_MUC2_Ratio3_P), ncol=2, byrow=TRUE)
colnames(tab) <- c('R2','Pval')
rownames(tab) <- c('IL10_Ratio1','IL10_Ratio2', 'IL10_Ratio3',
'IL1B_Ratio1', 'IL1B_Ratio2', 'IL1B_Ratio3',
'MUC2_Ratio1', 'MUC2_Ratio2', 'MUC2_Ratio3')
table_cecum_gene_ratios<- as.table(tab)
kbl(table_cecum_gene_ratios, caption = "Ileum- Statistics for the previous regressions", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
Row
-----------------------------------------------------------------------
### FIGURE 28 {data-width=500}
```{r ratios + ge ileum, include=TRUE, echo=FALSE, warning=FALSE}
#Grid 2: ILEUM
metadata= metadata_ileum
#IL10 + Ratios
x_label= "Ratio1"
y_label= "IL10"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_IL10
ile_IL10_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
ile_IL10_Ratio1_R=get_stats_R(x, y, metadata)
ile_IL10_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "IL10"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_IL10
ile_IL10_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
ile_IL10_Ratio2_R=get_stats_R(x, y, metadata)
ile_IL10_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "IL10"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_IL10
ile_IL10_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
ile_IL10_Ratio3_R=get_stats_R(x, y, metadata)
ile_IL10_Ratio3_P=get_stats_P(x, y, metadata)
#IL1B + Ratios
x_label= "Ratio1"
y_label= "IL1B"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_IL1B
ile_IL1B_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
ile_IL1B_Ratio1_R=get_stats_R(x, y, metadata)
ile_IL1B_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "IL1B"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_IL1B
ile_IL1B_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
ile_IL1B_Ratio2_R=get_stats_R(x, y, metadata)
ile_IL1B_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "IL1B"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_IL1B
ile_IL1B_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
ile_IL1B_Ratio3_R=get_stats_R(x, y, metadata)
ile_IL1B_Ratio3_P=get_stats_P(x, y, metadata)
#MUC2 + Ratios
x_label= "Ratio1"
y_label= "MUC2"
x <- metadata$ratio1LOG
y <- metadata$DeltaCq_MUC2
ile_MUC2_Ratio1= plot_regression(x, y, metadata, x_label, y_label)
ile_MUC2_Ratio1_R=get_stats_R(x, y, metadata)
ile_MUC2_Ratio1_P=get_stats_P(x, y, metadata)
x_label= "Ratio2"
y_label= "MUC2"
x <- metadata$ratio2LOG
y <- metadata$DeltaCq_MUC2
ile_MUC2_Ratio2= plot_regression(x, y, metadata, x_label, y_label)
ile_MUC2_Ratio2_R=get_stats_R(x, y, metadata)
ile_MUC2_Ratio2_P=get_stats_P(x, y, metadata)
x_label= "Ratio3"
y_label= "MUC2"
x <- metadata$ratio3LOG
y <- metadata$DeltaCq_MUC2
ile_MUC2_Ratio3= plot_regression(x, y, metadata, x_label, y_label)
ile_MUC2_Ratio3_R=get_stats_R(x, y, metadata)
ile_MUC2_Ratio3_P=get_stats_P(x, y, metadata)
gird_ileum=ggarrange(ile_IL10_Ratio1, ile_IL10_Ratio2, ile_IL10_Ratio3,
ile_IL1B_Ratio1,ile_IL1B_Ratio2, ile_IL1B_Ratio3,
ile_MUC2_Ratio1, ile_MUC2_Ratio2, ile_MUC2_Ratio3,
ncol = 3, nrow = 3)
gird_ileum_gene_ratios= annotate_figure(gird_ileum, top = text_grob("Ileum",
color = "black", face = "bold"))
gird_ileum_gene_ratios
```
### {data-width=500}
```{r ratios + ge ileum tabb, include=TRUE, echo=FALSE, warning=FALSE}
tab <- matrix(c(ile_IL10_Ratio1_R, ile_IL10_Ratio1_P,
ile_IL10_Ratio2_R, ile_IL10_Ratio2_P,
ile_IL10_Ratio3_R, ile_IL10_Ratio3_P,
ile_IL1B_Ratio1_R, ile_IL1B_Ratio1_P,
ile_IL1B_Ratio2_R, ile_IL1B_Ratio2_P,
ile_IL1B_Ratio3_R, ile_IL1B_Ratio3_P,
ile_MUC2_Ratio1_R, ile_MUC2_Ratio1_P,
ile_MUC2_Ratio2_R, ile_MUC2_Ratio2_P,
ile_MUC2_Ratio3_R, ile_MUC2_Ratio3_P), ncol=2, byrow=TRUE)
colnames(tab) <- c('R2','Pval')
rownames(tab) <- c('IL10_Ratio1','IL10_Ratio2', 'IL10_Ratio3',
'IL1B_Ratio1', 'IL1B_Ratio2', 'IL1B_Ratio3',
'MUC2_Ratio1', 'MUC2_Ratio2', 'MUC2_Ratio3')
table_ileum_gene_ratios<- as.table(tab)
kbl(table_ileum_gene_ratios, caption = "Ileum- Statistics for the previous regressions", row.names = TRUE) %>% kable_styling(position= "left", full_width = T)
```
Row {data-height=50}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Diversity indexes by Histopathology score**
```{r Indexes histo organize, include=FALSE}
#####Open ratio data#####
alpha_index_data <- complete_sample_table
alpha_index_data <- subset( alpha_index_data, select = c(18, 37,38 ))
#merge dataframes ratios and histo
histo_data_melted <- histo_data_class %>%
gather(Category, Score, -SampleID)
alpha_index_data <- alpha_index_data %>%
gather(Index, Value, -SampleID)
boxplot_data <- merge(alpha_index_data, histo_data_melted , by = "SampleID")
boxplot_data <- merge(boxplot_data, histo_data_location, by = "SampleID")
#Separate data for C and I
boxplot_data_cecum <- boxplot_data[boxplot_data$SampleLocation =="C",]
boxplot_data_ileum <- boxplot_data[boxplot_data$SampleLocation =="I",]
```
Row
-----------------------------------------------------------------------
#### FIGURE 29 {data-width=1000}
Cecum
```{r Indexes histo plots cec, include=TRUE, echo=FALSE}
#cecum plot
bp <- ggplot(boxplot_data_cecum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_cecum <- bp + facet_grid(Index~ Category, scales = "free")+
xlab("") + ylab("Ratio Value") +
theme(legend.position = "none")
bp_cecum
```
Row {data-height=80}
-----------------------------------------------------------------------
Row
-----------------------------------------------------------------------
#### FIGURE 30 {data-width=1000}
Ileum
```{r Indexes histo plots ile, include=TRUE, echo=FALSE}
#Ileum plot
bp <- ggplot(boxplot_data_ileum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_ileum <- bp + facet_grid(Index~ Category, scales = "free")+
xlab("") + ylab("Ratio Value") +
theme(legend.position = "none")
bp_ileum
```
Row {data-height=80}
-----------------------------------------------------------------------
Row {data-height=30}
-----------------------------------------------------------------------
**Microbiome Ratios by Histopathology score **
```{r ratios histo organize, include=FALSE}
#####Open ratio data#####
ratios_data <- complete_sample_table
ratios_data <- subset( ratios_data, select = c(18, 20, 22, 24 ))
#merge dataframes ratios and histo
histo_data_melted <- histo_data_class %>%
gather(Category, Score, -SampleID)
ratios_data_melted <- ratios_data %>%
gather(Ratio, Value, -SampleID)
boxplot_data <- merge(ratios_data_melted, histo_data_melted , by = "SampleID")
boxplot_data <- merge(boxplot_data, histo_data_location, by = "SampleID")
#Separate data for C and I
boxplot_data_cecum <- boxplot_data[boxplot_data$SampleLocation =="C",]
boxplot_data_ileum <- boxplot_data[boxplot_data$SampleLocation =="I",]
```
Row
-----------------------------------------------------------------------
#### FIGURE 31
Cecum
```{r ratios histo plots cec, include=TRUE, echo=FALSE}
#cecum plot
bp <- ggplot(boxplot_data_cecum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_cecum <- bp + facet_grid(Ratio~ Category)+
xlab("") + ylab("Ratio Value") +
theme(legend.position = "none")
bp_cecum
```
Row {data-height=80}
-----------------------------------------------------------------------
Row
-----------------------------------------------------------------------
#### FIGURE 32
Ileum
```{r ratios histo plots ile, include=TRUE, echo=FALSE}
#Ileum plot
bp <- ggplot(boxplot_data_ileum, aes(x=Score, y=Value)) +
geom_boxplot(aes(fill=Category))
bp_ileum <- bp + facet_grid(Ratio~ Category)+
xlab("") + ylab("Ratio Value") +
theme(legend.position = "none") +
theme(text=element_text(size=15))
bp_ileum
```